Bernoulli 18(2), 2012, 552-585 
DOI: 10.3150/11-BEJ354 



Degenerate U- and V-statistics under weak
dependence: Asymptotic theory and
bootstrap consistency

ANNE LEUCHT 

Friedrich-Schiller-Universität Jena, Institut für Stochastik, Ernst-Abbe-Platz 2, D-07743 Jena,
Germany. E-mail: anne.leucht@uni-jena.de 

We devise a general result on the consistency of model-based bootstrap methods for U- and
V-statistics under easily verifiable conditions. For that purpose, we derive the limit distributions of
degree-2 degenerate U- and V-statistics for weakly dependent $\mathbb{R}^d$-valued random variables first.
To this end, only some moment conditions and smoothness assumptions concerning the kernel
are required. Based on this result, we verify that the bootstrap counterparts of these statistics
have the same limit distributions. Finally, some applications to hypothesis testing are presented.

1. Introduction

Numerous test statistics can be formulated or approximated in terms of degenerate U-
or V-type statistics. Examples include the Cramér-von Mises statistic, the Anderson-Darling
statistic or the $\chi^2$-statistic. For i.i.d. random variables the limit distributions
of U- and V-statistics can be derived via a spectral decomposition of their kernel if the
latter is square integrable. To use the same method for dependent data, often restrictive
assumptions are required whose validity is quite complicated or even impossible to verify
in many cases. The first of our two main results is the derivation of the asymptotic
distributions of U- and V-statistics under assumptions that are fairly easy to check. This
approach is based on a wavelet decomposition instead of a spectral decomposition of the
kernel.

The limit distributions for both independent and dependent observations depend on
certain parameters which in turn depend on the underlying situation in a complicated
way. Therefore, problems arise as soon as critical values for test statistics of U- and V-type
have to be determined. The bootstrap offers a convenient way to circumvent these problems;
see Arcones and Giné [2], Dehling and Mikosch [10] or Leucht and Neumann [25] for
the i.i.d. case. To our knowledge, there are no results concerning bootstrapping general
degenerate U-statistics of non-independent observations. As a second main result of the
paper, we establish consistency of model-based bootstrap methods for U- and V-type
statistics of weakly dependent data.

In order to describe the dependence structure of the sample, we do not invoke the concept
of mixing, although a great variety of processes satisfy these constraints and various
tools of probability theory and statistics such as central limit theorems, probability and
moment inequalities can be carried over from the i.i.d. setting to mixing processes. However,
these methods of measuring dependencies are inappropriate in the present context
since we focus not only on the asymptotic behaviour of U- and V-type statistics but also on
bootstrap consistency. Model-based bootstrap methods can yield samples that are no
longer mixing even though the original sample satisfies some mixing condition. A simple
example is presented in Section 4.2. There we consider a model-specification test within
the class of nonlinear AR(1) processes. Under $H_0$, $X_k = g_0(X_{k-1}) + \varepsilon_k$, where $g_0$ is
Lipschitz contracting and $(\varepsilon_k)_k$ is a sequence of i.i.d. centered innovations. It is most natural
to draw the bootstrap innovations $(\varepsilon_k^*)_k$ via Efron's bootstrap from the recentered
residuals first. Then the bootstrap counterpart of $(X_k)_k$ is generated iteratively by choosing
an initial variable $X_0^*$ independently of $(\varepsilon_k^*)_k$ and defining $X_k^* = g_0(X_{k-1}^*) + \varepsilon_k^*$. Due
to the discreteness of the bootstrap innovations, commonly used coupling techniques to
prove mixing properties for Markovian processes fail; see also Andrews [1]. It turns out
that the characterization of dependence structures introduced by Dedecker and Prieur [9]
is exceptionally suitable here. Based on their $\tau$-dependence coefficient it is possible to
construct an $L_1$-coupling in the following sense. Let $\mathcal{M}$ denote a $\sigma$-algebra generated by
sample variables of the "past" and let $X$ be a random variable of a certain "future" time
point. Then, the minimal $L_1$-distance between $X$ and a random variable that has the
same distribution as $X$ but that is independent of $\mathcal{M}$ is equivalent to the $\tau$-dependence
coefficient $\tau(\mathcal{M}, X)$.

We exploit this coupling property in order to derive the asymptotic distribution for
the original as well as the bootstrap statistics of degenerate U-type. Basically, both
proofs follow the same lines. First, the (almost) Lipschitz continuous kernels of the
U-statistics are approximated by a finite wavelet series expansion. There are two crucial
points that assure asymptotic negligibility of the approximation error. On the one hand,
the smoothness of the kernel function carries over to its wavelet approximation uniformly
in scale, cf. Lemma 5.2. On the other hand, Lipschitz continuity of the kernel and the
$L_1$-coupling property of the underlying $\tau$-dependent sample perfectly fit together. The next
step consists of applying a central limit theorem and the continuous mapping
theorem to determine the limits of the approximating statistics of U-type. Based on
these investigations, the asymptotic distribution of the U-statistic and its bootstrap
counterpart is then deduced via passage to the limit. It can be expressed as an infinite
weighted sum of normal variables.

Our paper is organized as follows. We start with an overview of asymptotic results on
degenerate U-type statistics of dependent random variables. In Section 2.2, we introduce
the underlying concept of weak dependence and derive the asymptotic distributions of U-
and V-statistics. On the basis of these results, we deduce consistency of general bootstrap
methods in Section 3. Some applications of the theory to hypothesis testing are presented
in Section 4. All proofs are deferred to a final Section 5.

2. Asymptotic distributions of U- and V-statistics

2.1. Survey of literature

Let $(X_n)_{n\in\mathbb{N}}$ be a sequence of $\mathbb{R}^d$-valued random variables with common distribution $P_X$.
In the case of i.i.d. random variables, the limit distributions of degenerate U- and V-type
statistics, that is,

$$nU_n = \frac{1}{n}\sum_{j=1}^{n}\sum_{k\neq j} h(X_j, X_k) \quad\text{and}\quad nV_n = \frac{1}{n}\sum_{j,k=1}^{n} h(X_j, X_k)$$

with $h:\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}$ symmetric and $\int_{\mathbb{R}^d} h(x,y)\,P_X(dx) = 0$, $\forall y\in\mathbb{R}^d$, can be derived
by using a spectral decomposition of the kernel, $h(x,y) = \sum_{k=1}^{\infty}\lambda_k\Phi_k(x)\Phi_k(y)$, which
holds true in the $L_2$-sense. Here, $(\Phi_k)_k$ denote orthonormal eigenfunctions and $(\lambda_k)_k$ the
corresponding eigenvalues of the integral equation

$$\int_{\mathbb{R}^d} h(x,y)g(y)\,P_X(dy) = \lambda g(x). \tag{2.1}$$

Approximate $nU_n$ by $nU_n^{(K)} = \sum_{k=1}^{K}\lambda_k\{(n^{-1/2}\sum_{i=1}^{n}\Phi_k(X_i))^2 - n^{-1}\sum_{i=1}^{n}\Phi_k^2(X_i)\}$.
Then the sum under the round brackets is asymptotically standard normal while the
latter sum converges in probability to 1. Finally, one obtains

$$nU_n \xrightarrow{d} \sum_{k=1}^{\infty}\lambda_k(Z_k^2 - 1), \tag{2.2}$$

where $(Z_k)_k$ is a sequence of i.i.d. standard normal random variables; cf. Serfling [27].
If additionally $E|h(X_1,X_1)| < \infty$, the weak law of large numbers and Slutsky's theorem
imply $nV_n \xrightarrow{d} \sum_{k=1}^{\infty}\lambda_k(Z_k^2 - 1) + Eh(X_1,X_1)$. (Here, $\xrightarrow{d}$ denotes convergence in
distribution.)
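
The two statistics differ only in the diagonal terms $h(X_j, X_j)$, which the following minimal sketch makes explicit. The function name `degenerate_u_v` and the product kernel $h(x,y) = xy$ are illustrative assumptions, not taken from the paper; the product kernel is degenerate whenever the observations are centered.

```python
import random

def degenerate_u_v(sample, kernel):
    """Return (n*U_n, n*V_n) for a symmetric kernel h(x, y)."""
    n = len(sample)
    off_diagonal = 0.0   # sum of h(X_j, X_k) over j != k
    diagonal = 0.0       # sum of h(X_j, X_j)
    for j, xj in enumerate(sample):
        for k, xk in enumerate(sample):
            if j == k:
                diagonal += kernel(xj, xk)
            else:
                off_diagonal += kernel(xj, xk)
    n_u = off_diagonal / n
    n_v = (off_diagonal + diagonal) / n
    return n_u, n_v

# Illustrative kernel (an assumption): h(x, y) = x * y on centered i.i.d. data.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
n_u, n_v = degenerate_u_v(sample, lambda a, b: a * b)
```

For this particular kernel, $nU_n = (n^{-1/2}\sum_i X_i)^2 - n^{-1}\sum_i X_i^2$, so it is the one-term case of the approximation $nU_n^{(K)}$ above.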

So far, most previous attempts to derive the limit distributions of degenerate U- and
V-statistics of dependent random variables are based on the adoption of this method of
proof. Eagleson [15] developed the asymptotic theory in the case of a strictly stationary
sequence of $\phi$-mixing, real-valued random variables under the assumption of absolutely
summable eigenvalues. This condition is satisfied if the kernel function is of the form
$h(x,y) = \int_{\mathbb{R}} h_1(x,z)h_1(z,y)\,P_X(dz)$ and $h_1$ is square integrable w.r.t. $P_X$. Using general
heavy-tailed weight functions instead of $P_X$, the eigenvalues are not necessarily absolutely
summable; see, for example, de Wet [7]. Carlstein [5] analysed U-statistics of $\alpha$-mixing,
real-valued random variables in the case of finitely many eigenfunctions. He derived
a limit distribution of the form (2.2), where $(Z_k)_{k\in\mathbb{N}}$ is a sequence of centered normal
random variables. Denker [11] considered stationary sequences $(X_n = f(Y_n, Y_{n+1}, \ldots))_n$ of
functionals of $\beta$-mixing random variables $(Y_n)_n$. He assumed $f$ and the cumulative
distribution function of $X_1$ to be Hölder continuous. Imposing some smoothness condition on $h$,
the limit distribution of $nU_n$ was derived under the additional assumption $\|\Phi_k\|_\infty < \infty$,
$\forall k\in\mathbb{N}$. The condition on $(\Phi_k)_k$ is difficult or even impossible to check in a multitude of
cases since this requires solving the associated integral equation (2.1). Similar difficulties
occur if one wants to apply the results of Dewan and Prakasa Rao [12] or Huang and
Zhang [21]. They studied U-statistics of associated, real-valued random variables. Besides
the absolute summability of the eigenvalues, certain regularity conditions have to
be satisfied uniformly by the eigenfunctions in order to obtain the asymptotic distribution
of $nU_n$.

A different approach was used by Babbel [3] to determine the limit distribution of
U-statistics of $\phi$- and $\beta$-mixing random variables. She deduced the limit distribution
via a Haar wavelet decomposition of the kernel and empirical process theory without
imposing the critical conditions mentioned above. However, she presumed that
$\iint h(x,y)\,P_{X_k,X_{k+n}}(dx,dy) = 0$, $\forall k\in\mathbb{Z}$, $n\in\mathbb{N}$. This assumption does in general not hold
true within our applications in Section 3. Moreover, this approach is not suitable when
dealing with U-statistics of $\tau$-dependent random variables since Lipschitz continuity will
be the crucial property of the (approximating) kernel in order to exploit the underlying
dependence structure.



2.2. Main results 



Let $(X_n)_{n\in\mathbb{N}}$ be a sequence of $\mathbb{R}^d$-valued random variables on some probability space
$(\Omega,\mathcal{A},P)$ with common distribution $P_X$. In this subsection, we derive the limit
distributions of

$$nU_n = \frac{1}{n}\sum_{j=1}^{n}\sum_{k\neq j} h(X_j, X_k) \quad\text{and}\quad nV_n = \frac{1}{n}\sum_{j,k=1}^{n} h(X_j, X_k),$$

where $h:\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}$ is a symmetric function with $\int_{\mathbb{R}^d} h(x,y)\,P_X(dx) = 0$, $\forall y\in\mathbb{R}^d$. In
order to describe the dependence structure of $(X_n)_{n\in\mathbb{N}}$, we recall the definition of the
$\tau$-dependence coefficient for $\mathbb{R}^d$-valued random variables of Dedecker and Prieur [9].



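Definition 2.1. Let $(\Omega,\mathcal{A},P)$ be a probability space, $\mathcal{M}$ a sub-$\sigma$-algebra of $\mathcal{A}$ and $X$
an $\mathbb{R}^d$-valued random variable. Assume that $E\|X\|_{l_1} < \infty$, where $\|x\|_{l_1} = \sum_{i=1}^{d}|x_i|$, and
define

$$\tau(\mathcal{M},X) = E\Big(\sup_{f\in\Lambda_1(\mathbb{R}^d)}\Big|\int_{\mathbb{R}^d} f(x)\,P_{X|\mathcal{M}}(dx) - \int_{\mathbb{R}^d} f(x)\,P_X(dx)\Big|\Big).$$

Here, $P_{X|\mathcal{M}}$ denotes the conditional distribution of $X$ given $\mathcal{M}$ and $\Lambda_1(\mathbb{R}^d)$ denotes the
set of 1-Lipschitz functions from $\mathbb{R}^d$ to $\mathbb{R}$.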
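
By the Kantorovich-Rubinstein duality, the supremum over 1-Lipschitz functions inside the expectation is the Wasserstein-1 distance between the conditional and the unconditional law, so $\tau(\mathcal{M},X)$ is an averaged Wasserstein-1 distance. As a minimal sketch of that inner quantity in the real-valued case, the hypothetical helper below computes the Wasserstein-1 distance between two empirical distributions of equal size; it does not estimate $\tau$ itself.

```python
def wasserstein1(sample_p, sample_q):
    """Wasserstein-1 distance between two equal-size empirical laws on R."""
    if len(sample_p) != len(sample_q):
        raise ValueError("this sketch assumes samples of equal size")
    # In one dimension the optimal coupling pairs the order statistics.
    pairs = zip(sorted(sample_p), sorted(sample_q))
    return sum(abs(a - b) for a, b in pairs) / len(sample_p)

# Shifting every point by one unit moves the law by exactly one unit.
shift_distance = wasserstein1([0.0, 1.0], [1.0, 2.0])
```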



We assume

(A1) (i) $(X_n)_{n\in\mathbb{N}}$ is a (strictly) stationary sequence of $\mathbb{R}^d$-valued random variables
on some probability space $(\Omega,\mathcal{A},P)$ with common distribution $P_X$ and
$E\|X_1\|_{l_1} < \infty$.
(ii) The sequence $(\tau_r)_{r\in\mathbb{N}}$, defined by

$$\tau_r = \sup\{\tau(\sigma(X_{s_1},\ldots,X_{s_u}), (X_{t_1}', X_{t_2}', X_{t_3}')') \mid u\in\mathbb{N},\ s_1\le\cdots\le s_u < s_u + r \le t_1\le t_2\le t_3\in\mathbb{N}\},$$

satisfies $\sum_{r=1}^{\infty} r\,\tau_r^{\delta} < \infty$ for some $\delta\in(0,1)$. (Here, prime denotes the transposition.)



Remark 1. If $\Omega$ is rich enough, due to Dedecker and Prieur [8] the validity of (A1)
allows for the construction of a random vector $(\tilde X_{t_1}', \tilde X_{t_2}', \tilde X_{t_3}')' \overset{d}{=} (X_{t_1}', X_{t_2}', X_{t_3}')'$ that is
independent of $X_{s_1},\ldots,X_{s_u}$ and such that

$$\sum_{i=1}^{3} E\|\tilde X_{t_i} - X_{t_i}\|_{l_1} \le \tau_r. \tag{2.3}$$

The notion of $\tau$-dependence is more general than mixing. If, for example, $(X_n)_n$ is
$\beta$-mixing, we obtain an upper bound for the dependence coefficient $\tau_r \le 6\int_0^{\beta(r)} Q_{|X_1|}(u)\,du$,
where $Q_{|X_1|}(u) = \inf\{t\in\mathbb{R} \mid P(\|X_1\|_{l_1} > t) \le u\}$, $u\in[0,1]$, and $\beta(r)$ denotes the ordinary
$\beta$-mixing coefficient $\beta(r) := E\sup_{B\in\sigma(X_s,\,s\ge t+r),\,t\in\mathbb{Z}}|P(B\mid\sigma(X_s, s\le t)) - P(B)|$. This is
a consequence of Remark 2 of Dedecker and Prieur [8]. Moreover, inequality (2.3)
immediately implies

$$|\mathrm{cov}(h(X_{s_1},\ldots,X_{s_u}), k(X_{t_1},\ldots,X_{t_v}))| \le 2\|h\|_\infty \mathrm{Lip}(k)\lceil v/3\rceil\,\tau_r \tag{2.4}$$

for $s_1\le\cdots\le s_u < s_u + r\le t_1\le\cdots\le t_v\in\mathbb{N}$ and for all functions $h:\mathbb{R}^{ud}\to\mathbb{R}$ and
$k:\mathbb{R}^{vd}\to\mathbb{R}$ in $\mathcal{L} := \{f:\mathbb{R}^p\to\mathbb{R}$ for some $p\in\mathbb{N} \mid f$ Lipschitz continuous and bounded$\}$.
Therefore, a sequence of random variables that satisfies (A1) is $((\tau_r)_r, \mathcal{L}, \psi)$-weakly dependent
in the sense of Doukhan and Louhichi [14] with $\psi(h,k,u,v) = 2\|h\|_\infty\mathrm{Lip}(k)\lceil v/3\rceil$. (Here
and in the sequel, $\mathrm{Lip}(g)$ denotes the Lipschitz constant of a generic function $g$.) A list of
examples for $\tau$-dependent processes including causal linear and functional autoregressive
processes is provided by Dedecker and Prieur [9].

Besides the conditions on the dependence structure of $(X_n)_{n\in\mathbb{N}}$, we make the following
assumptions concerning the kernel:

(A2) (i) The kernel $h:\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}$ is a symmetric, measurable function and degenerate
under $P_X$, that is, $\int_{\mathbb{R}^d} h(x,y)\,P_X(dx) = 0$, $\forall y\in\mathbb{R}^d$.
(ii) For a $\delta$ satisfying (A1)(ii), the following moment constraints hold true with
some $\nu > (2-\delta)/(1-\delta)$ and an independent copy $\tilde X_1$ of $X_1$:

$$\sup_{k\in\mathbb{N}} E|h(X_1, X_{1+k})|^{\nu} < \infty \quad\text{and}\quad E|h(X_1,\tilde X_1)|^{\nu} < \infty.$$

(A3) The kernel $h$ is Lipschitz continuous.

Using an appropriate kernel truncation, it is possible to reduce the problem of deriving
the asymptotic distribution of $nU_n$ to statistics with bounded kernel functions.

Lemma 2.1. Suppose that (A1), (A2), and (A3) are fulfilled. Then there exists a family
of bounded functions $(h_c)_{c\in\mathbb{R}^+}$ satisfying (A2) and (A3) uniformly such that

$$\lim_{c\to\infty}\sup_{n\in\mathbb{N}} n^2 E(U_n - U_{n,c})^2 = 0, \tag{2.5}$$

where $U_{n,c} = n^{-2}\sum_{j=1}^{n}\sum_{k\neq j} h_c(X_j, X_k)$.

After this simplification of the problem, we intend to develop a decomposition of the 
kernel that allows for the application of a central limit theorem (CLT) for weakly depen- 
dent random variables. One could try to imitate the proof of the i.i.d. case. According 
to the discussion in the previous subsection, this leads to prerequisites that can hardly 
be checked in numerous cases. Therefore, we do not use a spectral decomposition of the 
kernel but a wavelet decomposition. It turns out that Lipschitz continuity is the central 
property the kernel function should satisfy in order to exploit (2.3). For this reason, 
the choice of Haar wavelets, as they were employed by Babbel [3], is inappropriate in
the present situation. Instead, the application of Lipschitz continuous scale and wavelet 
functions is more suitable. 

In the sequel, let $\phi$ and $\psi$ denote scale and wavelet functions associated with a
one-dimensional multiresolution analysis. As illustrated by Daubechies [6], Section 8,
these functions can be selected in such a manner that they possess the following properties:

(1) $\phi$ and $\psi$ are Lipschitz continuous,

(2) $\phi$ and $\psi$ have compact support,

(3) $\int_{-\infty}^{\infty}\phi(x)\,dx = 1$ and $\int_{-\infty}^{\infty}\psi(x)\,dx = 0$.

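It is well known that an orthonormal basis in $L_2(\mathbb{R}^d)$ can be constructed from $\phi$ and $\psi$.
For this purpose, define $E := \{0,1\}^d\setminus\{0_d\}$, where $0_d$ denotes the $d$-dimensional null
vector. In addition, set

$$\varphi^{(i)} := \begin{cases}\phi & \text{for } i = 0,\\ \psi & \text{for } i = 1\end{cases}$$

and define functions $\Psi_{j,k}^{(e)}:\mathbb{R}^d\to\mathbb{R}$, $j\in\mathbb{Z}$, $k = (k_1,\ldots,k_d)'\in\mathbb{Z}^d$, by

$$\Psi_{j,k}^{(e)}(x) := 2^{jd/2}\prod_{i=1}^{d}\varphi^{(e_i)}(2^j x_i - k_i) \quad \forall e = (e_1,\ldots,e_d)'\in E,\ x = (x_1,\ldots,x_d)'\in\mathbb{R}^d.$$

The system $(\Psi_{j,k}^{(e)})_{e\in E,\,j\in\mathbb{Z},\,k\in\mathbb{Z}^d}$ is an orthonormal basis of $L_2(\mathbb{R}^d)$, see Wojtaszczyk [29],
Section 5. The same holds true for $(\Phi_{0,k})_{k\in\mathbb{Z}^d}\cup(\Psi_{j,k}^{(e)})_{j\ge 0,\,e\in E,\,k\in\mathbb{Z}^d}$, where the functions
$\Phi_{j,k}:\mathbb{R}^d\to\mathbb{R}$ are given by $\Phi_{j,k}(x) := 2^{jd/2}\prod_{i=1}^{d}\phi(2^j x_i - k_i)$, $j\in\mathbb{Z}$, $k\in\mathbb{Z}^d$.

Now, an $L_2$-approximation of $nU_{n,c}$ by a statistic based on a wavelet approximation
of $h_c$ can be established. To this end, we introduce $\tilde h_c^{(K,L)}$ with

$$\tilde h_c^{(K,L)}(x,y) := \sum_{k_1,k_2\in\{-L,\ldots,L\}^d}\alpha_{k_1,k_2}^{(c)}\Phi_{0,k_1}(x)\Phi_{0,k_2}(y) + \sum_{j=0}^{J(K)-1}\ \sum_{k_1,k_2\in\{-L,\ldots,L\}^d}\ \sum_{e\in\bar E}\beta_{j;k_1,k_2}^{(c,e)}\Psi_{j;k_1,k_2}^{(e)}(x,y), \tag{2.6}$$

where $\bar E := (E\times E)\cup(E\times\{0_d\})\cup(\{0_d\}\times E)$,

$$\Psi_{j;k_1,k_2}^{(e)}(x,y) := \begin{cases}\Psi_{j,k_1}^{(e_1)}(x)\Psi_{j,k_2}^{(e_2)}(y) & \text{for } (e_1',e_2')'\in E\times E,\\ \Psi_{j,k_1}^{(e_1)}(x)\Phi_{j,k_2}(y) & \text{for } (e_1',e_2')'\in E\times\{0_d\},\\ \Phi_{j,k_1}(x)\Psi_{j,k_2}^{(e_2)}(y) & \text{for } (e_1',e_2')'\in\{0_d\}\times E,\end{cases}$$

$\alpha_{k_1,k_2}^{(c)} = \iint_{\mathbb{R}^d\times\mathbb{R}^d} h_c(x,y)\Phi_{0,k_1}(x)\Phi_{0,k_2}(y)\,dx\,dy$ and $\beta_{j;k_1,k_2}^{(c,e)} = \iint_{\mathbb{R}^d\times\mathbb{R}^d} h_c(x,y)\Psi_{j;k_1,k_2}^{(e)}(x,y)\,dx\,dy$.
We refer to the degenerate version of $\tilde h_c^{(K,L)}$ as $h_c^{(K,L)}$, given by

$$h_c^{(K,L)}(x,y) = \tilde h_c^{(K,L)}(x,y) - \int_{\mathbb{R}^d}\tilde h_c^{(K,L)}(x,y)\,P_X(dx) - \int_{\mathbb{R}^d}\tilde h_c^{(K,L)}(x,y)\,P_X(dy) + \iint_{\mathbb{R}^d\times\mathbb{R}^d}\tilde h_c^{(K,L)}(x,y)\,P_X(dx)\,P_X(dy).$$

The associated U-type statistic will be denoted by $U_{n,c}^{(K,L)}$.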

Lemma 2.2. Assume that (A1), (A2), and (A3) are fulfilled. Then the sequence of
indices $(J(K))_{K\in\mathbb{N}}$ in (2.6) with $J(K)\to_{K\to\infty}\infty$ can be chosen such that

$$\lim_{K\to\infty}\limsup_{L\to\infty}\sup_{n\in\mathbb{N}} n^2 E(U_{n,c} - U_{n,c}^{(K,L)})^2 = 0.$$

Employing the CLT of Neumann and Paparoditis [26] and the continuous mapping
theorem, we obtain the limit distribution of $nU_{n,c}^{(K,L)}$. Finally, based on this result, the
asymptotics of the U-type statistic $nU_n$ can be derived. Moreover, a weak law of large
numbers (Lemma 5.1 in Section 5.2) allows for deducing the limit distribution of $nV_n$
since $nV_n = nU_n + n^{-1}\sum_{k=1}^{n} h(X_k, X_k)$.

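Before stating the main result of this section, we introduce constants $A_{k_1,k_2} :=
\mathrm{cov}(\Phi_{0,k_1}(X_1), \Phi_{0,k_2}(X_1))$ and

$$B_{j;k_1,k_2}^{(e)} := \begin{cases}\mathrm{cov}(\Psi_{j,k_1}^{(e_1)}(X_1), \Psi_{j,k_2}^{(e_2)}(X_1)) & \text{for } (e_1',e_2')'\in E\times E,\\ \mathrm{cov}(\Psi_{j,k_1}^{(e_1)}(X_1), \Phi_{j,k_2}(X_1)) & \text{for } (e_1',e_2')'\in E\times\{0_d\},\\ \mathrm{cov}(\Phi_{j,k_1}(X_1), \Psi_{j,k_2}^{(e_2)}(X_1)) & \text{for } (e_1',e_2')'\in\{0_d\}\times E,\end{cases} \qquad j\in\mathbb{Z},\ k_1,k_2\in\mathbb{Z}^d.$$

Theorem 2.1. Suppose that the assumptions (A1), (A2), and (A3) are fulfilled. Then,
as $n\to\infty$,

$$nU_n \xrightarrow{d} Z$$

with

$$Z := \lim_{c\to\infty}\Big(\sum_{k_1,k_2\in\mathbb{Z}^d}\alpha_{k_1,k_2}^{(c)}[Z_{k_1}Z_{k_2} - A_{k_1,k_2}] + \sum_{j=0}^{\infty}\ \sum_{k_1,k_2\in\mathbb{Z}^d}\ \sum_{e=(e_1',e_2')'\in\bar E}\beta_{j;k_1,k_2}^{(c,e)}[Z_{j,k_1}^{(e_1)}Z_{j,k_2}^{(e_2)} - B_{j;k_1,k_2}^{(e)}]\Big).$$

Here, $(Z_k)_{k\in\mathbb{Z}^d}$ as well as $(Z_{j,k}^{(e)})_{j\ge 0,\,k\in\mathbb{Z}^d,\,e\in\{0,1\}^d}$ are centered and jointly normally
distributed random variables and the r.h.s. converges in the $L_2$-sense. If additionally
$E|h(X_1,X_1)| < \infty$, then

$$nV_n \xrightarrow{d} Z + Eh(X_1,X_1).$$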



As in the case of i.i.d. random variables, the limit distributions of $nU_n$ and $nV_n$ are,
up to a constant, weighted sums of products of centered normal random variables. In
contrast to many other results in the literature, the prerequisites of this theorem, namely
moment constraints and Lipschitz continuity of the kernel, can be checked fairly easily
in many cases. Nevertheless, the asymptotic distribution has a complicated structure.
Hence, quantiles can hardly be determined on the basis of the previous result. However,
we show in the following section that the conditional distributions of the bootstrap
counterparts of $nU_n$ and $nV_n$, given $X_1,\ldots,X_n$, converge to the same limits in probability.

Of course, the assumption of Lipschitz continuous kernels is rather restrictive. Thus, 
we extend our theory to a more general class of kernel functions. The costs for enlarging 
the class of feasible kernels are additional moment constraints. 

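Besides (A1) and (A2), we assume

(A4) (i) The kernel function satisfies

$$|h(x,y) - h(\bar x,\bar y)| \le f(x,\bar x,y,\bar y)[\|x - \bar x\|_{l_1} + \|y - \bar y\|_{l_1}] \quad \forall x,\bar x,y,\bar y\in\mathbb{R}^d,$$

where $f:\mathbb{R}^{4d}\to\mathbb{R}$ is continuous. Moreover,

$$\sup_{Y_1,\ldots,Y_5\sim P_X} E\Big(\max_{a_1,a_2\in[-A,A]^d}[f(Y_1, Y_2 + a_1, Y_3, Y_4 + a_2)]^{\eta}\,\|Y_5\|_{l_1}\Big) < \infty$$

for $\eta := 1/(1-\delta)$ with $\delta$ satisfying (A2) and some $A > 0$.
(ii) $\sum_{r=1}^{\infty} r(\tau_r)^{\delta^2} < \infty$.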

Even though the assumption (A4)(i) has a rather technical structure, it is satisfied, for
example, by polynomial kernel functions as long as the sample variables have sufficiently
many finite moments. Analogous to Lemma 2.1 and Lemma 2.2, the following assertion
holds.

Lemma 2.3. Suppose that (A1), (A2), and (A4) are fulfilled. Then a family of bounded
kernels $(h_c)_c$ satisfying (A2) and (A4) uniformly and the sequence of indices $(J(K))_{K\in\mathbb{N}}$
in (2.6) with $J(K)\to_{K\to\infty}\infty$ can be chosen such that

$$\lim_{c\to\infty}\limsup_{K\to\infty}\limsup_{L\to\infty}\sup_{n\in\mathbb{N}} n\,E|U_n - U_{n,c}^{(K,L)}| = 0.$$

This auxiliary result implies the analogue of Theorem 2.1 for non-Lipschitz kernels.

Theorem 2.2. Assume that (A1), (A2), and (A4) are satisfied. Then, as $n\to\infty$,

$$nU_n \xrightarrow{d} Z,$$

where $Z$ is defined as in Theorem 2.1. If additionally $E|h(X_1,X_1)| < \infty$, then

$$nV_n \xrightarrow{d} Z + Eh(X_1,X_1).$$

3. Consistency of general bootstrap methods 

As we have seen in the previous section, the limit distributions of degenerate U- and
V-statistics have a rather complicated structure. Therefore, in the majority of cases it is
quite difficult to determine quantiles, which are required in order to derive asymptotic
critical values of U- and V-type test statistics. The bootstrap offers a suitable way of
approximating these quantities.

Given $X_1,\ldots,X_n$, let $X^*$ and $Y^*$ denote vectors of bootstrap random variables with
values in $\mathbb{R}^{d_1}$ and $\mathbb{R}^{d_2}$. In order to describe the dependence structure of the bootstrap
sample, we introduce, in analogy to Definition 2.1,

$$\tau^*(Y^*, X^*, x_n) := E\Big(\sup_{f\in\Lambda_1(\mathbb{R}^{d_1})}\Big|\int_{\mathbb{R}^{d_1}} f(x)\,P_{X^*|Y^*}(dx) - \int_{\mathbb{R}^{d_1}} f(x)\,P_{X^*}(dx)\Big|\ \Big|\ \mathbb{X}_n = x_n\Big),$$

provided that $E(\|X^*\|_{l_1}\mid\mathbb{X}_n = x_n) < \infty$ with $\mathbb{X}_n := (X_1',\ldots,X_n')'$. We make the following
assumptions:

(A1*) (i) The sequence of bootstrap variables is stationary with probability tending
to one. Additionally, $(X_{t_1}^{*\prime}, X_{t_2}^{*\prime})' \xrightarrow{d} (X_{t_1}', X_{t_2}')'$, $\forall t_1,t_2\in\mathbb{N}$, holds true in
probability.

(ii) Conditionally on $X_1,\ldots,X_n$, the random variables $(X_k^*)_{k\in\mathbb{Z}}$ are $\tau$-weakly
dependent, that is, there exist a sequence of coefficients $(\bar\tau_r)_{r\in\mathbb{N}}$ with
$\sum_{r=1}^{\infty} r(\bar\tau_r)^{\delta} < \infty$ for some $\delta\in(0,1)$, a constant $C_1 < \infty$, and a sequence of
sets $(\mathcal{X}_n^{(1)})_{n\in\mathbb{N}}$ with $P(\mathbb{X}_n\in\mathcal{X}_n^{(1)})\to_{n\to\infty} 1$ and the following property: For
any sequence $(x_n)_{n\in\mathbb{N}}$ with $x_n\in\mathcal{X}_n^{(1)}$, $n\in\mathbb{N}$, $\sup_{k\in\mathbb{N}} E(\|X_k^*\|_{l_1}\mid\mathbb{X}_n = x_n)\le C_1$
and

$$\tau_r^*(x_n) := \sup\{\tau^*((X_{s_1}^{*\prime},\ldots,X_{s_u}^{*\prime})', (X_{t_1}^{*\prime}, X_{t_2}^{*\prime}, X_{t_3}^{*\prime})', x_n) \mid u\in\mathbb{N},\ s_1\le\cdots\le s_u < s_u + r\le t_1\le t_2\le t_3\in\mathbb{N}\}$$

can be bounded by $\bar\tau_r$ for all $r\in\mathbb{N}$.



Remark 2.

(i) Neumann and Paparoditis [26] proved that in case of stationary Markov chains
of finite order, the key for convergence of the finite-dimensional distributions is
convergence of the conditional distributions, cf. their Lemma 4.2. In particular,
they showed that AR(p) bootstrap and ARCH(p) bootstrap yield samples that
satisfy (A1*)(i).

(ii) In Section 4.2, we present another example that satisfies (A1*), namely a
residual-based bootstrap procedure for a Lipschitz contracting nonlinear AR(1) process,
given by $X_t = g(X_{t-1}) + \varepsilon_t$. In particular, note that the bootstrap process there
cannot be proved to be mixing owing to the discreteness of the bootstrap
innovations that are generated via Efron's bootstrap from the empirical distribution
of the recentered residuals of the original process.

Lemma 3.1. Suppose that (A1) and (A1*) hold true. Further let $h:\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}$ be
a bounded, symmetric, Lipschitz continuous function such that $Eh(X_1,y) = E(h(X_1^*,y)\mid
X_1,\ldots,X_n) = 0$, $\forall y\in\mathbb{R}^d$. Then,

$$\frac{1}{n}\sum_{j=1}^{n}\sum_{k\neq j} h(X_j^*, X_k^*) \xrightarrow{d} Z \quad\text{and}\quad \frac{1}{n}\sum_{j,k=1}^{n} h(X_j^*, X_k^*) \xrightarrow{d} Z + Eh(X_1,X_1)$$

hold in probability as $n\to\infty$. Here, $Z$ is defined as in Theorem 2.1.

In order to deduce bootstrap consistency, additionally, convergence in a certain metric $\rho$
is required, that is,

$$\rho\Big(P\Big(\frac{1}{n}\sum_{j=1}^{n}\sum_{k\neq j} h(X_j^*, X_k^*)\le x\ \Big|\ X_1,\ldots,X_n\Big),\ P\Big(\frac{1}{n}\sum_{j=1}^{n}\sum_{k\neq j} h(X_j, X_k)\le x\Big)\Big) \xrightarrow{P} 0.$$

(Here, $\xrightarrow{P}$ denotes convergence in probability.) Convergence in the uniform metric
follows from Lemma 3.1 if the limit distribution has a continuous cumulative distribution
function. The next assertion gives a necessary and sufficient condition for this.

Lemma 3.2. The limit variable $Z$, derived in Theorem 2.1/Theorem 2.2 under (A1),
(A2), and (A3)/(A4), has a continuous cumulative distribution function if $\mathrm{var}(Z) > 0$.



Kernels of statistics emerging from goodness-of-fit tests for composite hypotheses often
depend on an unknown parameter. We establish bootstrap consistency for this setting,
that is, when parameters have to be estimated. Moreover, the class of feasible kernels is
enlarged. For this purpose, we additionally assume

(A2*) (i) $\hat\theta_n \xrightarrow{P} \theta\in\Theta\subseteq\mathbb{R}^p$.

(ii) $E(h(X_1^*, y, \hat\theta_n)\mid\mathbb{X}_n) = 0$, $\forall y\in\mathbb{R}^d$.

(iii) For some $\delta$ satisfying (A1*)(ii), $\nu > (2-\delta)/(1-\delta)$, and a constant $C_2 < \infty$,
there exists a sequence of sets $(\mathcal{X}_n^{(2)})_{n\in\mathbb{N}}$ such that $P(\mathbb{X}_n\in\mathcal{X}_n^{(2)})\to_{n\to\infty} 1$
and $\forall (x_n)_{n\in\mathbb{N}}$ with $x_n\in\mathcal{X}_n^{(2)}$ the following moment constraint holds true:

$$\sup_{1\le k<n} E(|h(X_1^*, X_{1+k}^*, \hat\theta_n)|^{\nu} + |h(X_1^*, \tilde X_1^*, \hat\theta_n)|^{\nu}\mid\mathbb{X}_n = x_n) \le C_2,$$

where (conditionally on $\mathbb{X}_n$) $\tilde X_1^*$ denotes an independent copy of $X_1^*$.

(A3*) (i) The kernel is continuous in its third argument in some neighbourhood
$U(\theta)\subseteq\Theta$ of $\theta$ and satisfies

$$|h(x,y,\hat\theta_n) - h(\bar x,\bar y,\hat\theta_n)| \le f(x,\bar x,y,\bar y,\hat\theta_n)[\|x - \bar x\|_{l_1} + \|y - \bar y\|_{l_1}]$$

for all $x,\bar x,y,\bar y\in\mathbb{R}^d$, where $f:\mathbb{R}^{4d}\times\mathbb{R}^p\to\mathbb{R}$ is continuous on $\mathbb{R}^{4d}\times U(\theta)$.
Moreover, for $\eta := 1/(1-\delta)$ and some constants $A > 0$, $C_3 < \infty$ there
exists a sequence of sets $(\mathcal{X}_n^{(3)})_{n\in\mathbb{N}}$ such that $P(\mathbb{X}_n\in\mathcal{X}_n^{(3)})\to_{n\to\infty} 1$
and $\forall (x_n)_{n\in\mathbb{N}}$ with $x_n\in\mathcal{X}_n^{(3)}$ the following moment constraint holds
true:

$$E\Big(\max_{a_1,a_2\in[-A,A]^d}[f(Y_1^*, Y_2^* + a_1, Y_3^*, Y_4^* + a_2, \hat\theta_n)]^{\eta}\,\|Y_5^*\|_{l_1}\ \Big|\ \mathbb{X}_n = x_n\Big) \le C_3$$

for all $Y_1^*,\ldots,Y_5^*$ with $Y_k^* \overset{d}{=} X_1^*$, $k\in\{1,\ldots,5\}$ (conditionally on $X_1,\ldots,X_n$).

(ii) $\sum_{r=1}^{\infty} r(\bar\tau_r)^{\delta^2} < \infty$.

Under these assumptions a result concerning the asymptotic distributions of
$nU_n^* = n^{-1}\sum_{j=1}^{n}\sum_{k\neq j} h(X_j^*, X_k^*, \hat\theta_n)$ and $nV_n^* = n^{-1}\sum_{j,k=1}^{n} h(X_j^*, X_k^*, \hat\theta_n)$ can be derived. To this
end, we denote the U- and V-statistics with kernel $h(\cdot,\cdot,\theta)$ and arguments $X_1,\ldots,X_n$
by $U_n$ and $V_n$, respectively.

Theorem 3.1. Suppose that the conditions (A1), (A2), and (A4) as well as (A1*),
(A2*), and (A3*) are fulfilled.

(i) As $n\to\infty$,

$$nU_n^* \xrightarrow{d} Z, \quad\text{in probability},$$

where $Z$ is defined as in Theorem 2.1. If furthermore $\mathrm{var}(Z) > 0$, then

$$\sup_{-\infty<x<\infty}|P(nU_n^*\le x\mid X_1,\ldots,X_n) - P(nU_n\le x)| \xrightarrow{P} 0.$$

(ii) If additionally $E|h(X_1,X_1,\theta)| < \infty$ and $E(|h(X_1^*, X_1^*, \hat\theta_n)|\mid\mathbb{X}_n) \xrightarrow{P} E|h(X_1,X_1,\theta)|$,
then as $n\to\infty$,

$$nV_n^* \xrightarrow{d} Z + Eh(X_1,X_1,\theta), \quad\text{in probability}.$$

Moreover, in case of $\mathrm{var}(Z) > 0$,

$$\sup_{-\infty<x<\infty}|P(nV_n^*\le x\mid X_1,\ldots,X_n) - P(nV_n\le x)| \xrightarrow{P} 0.$$

Remark 3. Theorem 3.1 implies that bootstrap-based tests of U- or V-type have
asymptotically a prescribed size $\alpha$, that is, $P(nU_n > t_{u,\alpha}^*)\to_{n\to\infty}\alpha$ and
$P(nV_n > t_{v,\alpha}^*)\to_{n\to\infty}\alpha$, where $t_{u,\alpha}^*$ and $t_{v,\alpha}^*$ denote the $(1-\alpha)$-quantiles of $nU_n^*$ and $nV_n^*$,
respectively, given $X_1,\ldots,X_n$.
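
In practice the conditional quantiles in Remark 3 are usually approximated by Monte Carlo, that is, by simulating many bootstrap replicates of the statistic and taking an empirical quantile. The following minimal sketch illustrates that step only; the function names and the empirical-quantile convention are assumptions made for illustration, and the list of bootstrap statistics is taken as given.

```python
import math

def bootstrap_critical_value(bootstrap_stats, alpha):
    """Empirical (1 - alpha)-quantile of simulated bootstrap statistics."""
    ordered = sorted(bootstrap_stats)
    # Smallest order statistic with at least a (1 - alpha) share at or below it.
    index = math.ceil((1.0 - alpha) * len(ordered)) - 1
    return ordered[max(index, 0)]

def reject(test_stat, bootstrap_stats, alpha=0.05):
    """Reject the null hypothesis if the statistic exceeds the critical value."""
    return test_stat > bootstrap_critical_value(bootstrap_stats, alpha)
```

Here `test_stat` plays the role of $nU_n$ or $nV_n$ and `bootstrap_stats` holds simulated values of $nU_n^*$ or $nV_n^*$.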

4. $L_2$-tests for weakly dependent observations

This section is dedicated to two applications in the field of hypothesis testing. For the sake
of simplicity, we restrict ourselves to real-valued random variables and consider simple
null hypotheses only. The test for symmetry as well as the model-specification test can
be extended to problems with composite hypotheses, cf. Leucht [23, 24].

4.1. A test for symmetry 

Answering the question whether a distribution is symmetric or not is interesting for
several reasons. Often robust estimators of and robust tests for location parameters
assume the observations to arise from a symmetric distribution, see, for example, Staudte
and Sheather [28]. Consequently, it is important to check this assumption before applying
those methods. Moreover, symmetry plays a central role in analyzing and modeling
real-life phenomena. For instance, it is often presumed that an observed process can be
described by an AR(p) process with Gaussian innovations which in turn implies a Gaussian
marginal distribution. Rejecting the hypothesis of symmetry contradicts this type of
marginal distribution. Furthermore, this result of the test excludes any kind of symmetric
innovations in that context.

Suppose that we observe $X_1,\ldots,X_n$ from a sequence of real-valued random variables
with common distribution $P_X$ and satisfying (A1). For some $\mu\in\mathbb{R}$, we are given the
problem

$$H_0: P_{X-\mu} = P_{\mu-X} \quad\text{vs.}\quad H_1: P_{X-\mu}\neq P_{\mu-X}.$$

Similar to Feuerverger and Mureika [18], who studied the problem for i.i.d. random
variables, we propose the following test statistic:

$$S_n = n\int_{\mathbb{R}}[\Im(c_n(t)e^{-i\mu t})]^2 w(t)\,dt = \frac{1}{n}\sum_{j,k=1}^{n}\int_{\mathbb{R}}\sin(t(X_j - \mu))\sin(t(X_k - \mu))w(t)\,dt,$$

which makes use of the fact that symmetry of a distribution is equivalent to a vanishing
imaginary part of the associated characteristic function. Here, $\Im z$ denotes the imaginary
part of $z\in\mathbb{C}$, $c_n$ denotes the empirical characteristic function and $w$ is some positive
measurable weight function with $\int_{\mathbb{R}}(1 + |t|)w(t)\,dt < \infty$. Obviously, $S_n$ is a V-type
statistic whose kernel satisfies (A2) and (A3). Thus, its limit distribution can be determined
by Theorem 2.1. Assuming that the observations come from a stationary AR(p)
or ARCH(p) process, the validity of (A1*) is assured when the AR(p) or ARCH(p) bootstrap
methods given by Neumann and Paparoditis [26] are used in order to generate the
bootstrap counterpart of the sample. Hence, in these cases the prerequisites of Lemma 3.1
are satisfied except for degeneracy. Inspired by Dehling and Mikosch [10], who discussed
this problem for Efron's bootstrap in the i.i.d. case, we propose a bootstrap statistic
with the kernel

$$h_n^*(x,y) = h(x,y) - \int_{\mathbb{R}} h(x,y)\,P_n^*(dx) - \int_{\mathbb{R}} h(x,y)\,P_n^*(dy) + \iint_{\mathbb{R}^2} h(x,y)\,P_n^*(dx)\,P_n^*(dy).$$

Here, $h$ denotes the kernel function of $S_n$ and $P_n^*$ the distribution of $X_1^*$ conditionally on
$X_1,\ldots,X_n$. Similar to the proof of Theorem 3.1, the desired convergence property of $S_n^*$
can be verified.
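
For a concrete weight function the kernel of $S_n$ has a closed form. The sketch below assumes, purely for illustration, that $w$ is the standard normal density (which is positive and satisfies the integrability condition on $w$). Using $\sin(ta)\sin(tb) = [\cos(t(a-b)) - \cos(t(a+b))]/2$ and $\int\cos(tu)w(t)\,dt = e^{-u^2/2}$, the kernel becomes $h(x,y) = \frac{1}{2}[e^{-(x-y)^2/2} - e^{-(x+y-2\mu)^2/2}]$. The function names are hypothetical.

```python
import math

def symmetry_kernel(x, y, mu=0.0):
    """Kernel of S_n when w is the standard normal density (an assumption)."""
    a, b = x - mu, y - mu
    # Integral of sin(t*a) * sin(t*b) * w(t) dt in closed form.
    return 0.5 * (math.exp(-0.5 * (a - b) ** 2) - math.exp(-0.5 * (a + b) ** 2))

def symmetry_statistic(sample, mu=0.0):
    """V-type statistic S_n = (1/n) * sum over all pairs (j, k) of h(X_j, X_k)."""
    n = len(sample)
    total = sum(symmetry_kernel(xj, xk, mu) for xj in sample for xk in sample)
    return total / n

# A sample that is exactly symmetric about mu = 0 gives S_n = 0 up to rounding.
s_symmetric = symmetry_statistic([-2.0, -1.0, 1.0, 2.0])
```

Critical values would still have to come from the bootstrap, since the limit distribution of $S_n$ depends on the unknown dependence structure.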



4.2. A model-specification test 

Let $X_0,\ldots,X_n$ be observations resulting from a stationary real-valued nonlinear
autoregressive process with centered i.i.d. innovations $(\varepsilon_k)_{k\in\mathbb{Z}}$, that is, $X_k = g(X_{k-1}) + \varepsilon_k$.
Suppose that $E|\varepsilon_0|^{4+\delta} < \infty$ for some $\delta > 0$ and that $g\in G := \{f:\mathbb{R}\to\mathbb{R} \mid f$ Lipschitz continuous
with $\mathrm{Lip}(f) < 1\}$. Thus, the process $(X_k)_{k\in\mathbb{Z}}$ is $\tau$-dependent with exponential rate, see
Dedecker and Prieur [9], Example 4.2. We will present a test for the problem

$$H_0: P(E(X_1\mid X_0) = g_0(X_0)) = 1 \quad\text{vs.}\quad H_1: P(E(X_1\mid X_0) = g_0(X_0)) < 1$$

with $g_0\in G$. For the sake of simplicity, we stick to these small classes of functions $G$ and of
processes $(X_k)_{k\in\mathbb{Z}}$. An extension to a more comprehensive variety of model-specification
tests is investigated in a forthcoming paper, cf. Leucht [24].

Similar to Fan and Li [16], we propose the following test statistic:

$$T_n = \frac{1}{n\sqrt{h}}\sum_{j=1}^{n}\sum_{k\neq j}(X_j - g_0(X_{j-1}))(X_k - g_0(X_{k-1}))K\Big(\frac{X_{j-1} - X_{k-1}}{h}\Big) =: \frac{1}{n}\sum_{j=1}^{n}\sum_{k\neq j} H(Z_j, Z_k),$$

that is, a kernel estimator (multiplied with $n\sqrt{h}$) of $E([X_1 - g_0(X_0)]E(X_1 - g_0(X_0)\mid
X_0)p(X_0))$ that is equal to zero under $H_0$. Here, $Z_k := (X_k, X_{k-1})'$, $k\in\mathbb{Z}$, and $p$ denotes
the density of the distribution of $X_0$. Fan and Li [16], who considered $\beta$-mixing processes,
used a similar test statistic with a vanishing bandwidth. In contrast, we consider
the case of a fixed bandwidth. These tests are more powerful against Pitman alternatives
$g_{1,n}(x) = g_0(x) + n^{-\beta}w(x) + o(n^{-\beta})$, $\beta > 0$, $w\in G$. For a detailed discussion of this topic,
see Fan and Li [17].

Obviously, $T_n$ is degenerate under $H_0$. If we assume $K$ to be a bounded, even, and 
Lipschitz continuous function, then there exists a function $f\colon\mathbb{R}^8\to\mathbb{R}$ with 
$|H(z_1,z_2) - H(\bar z_1,\bar z_2)| \le f(z_1,\bar z_1,z_2,\bar z_2)(\|z_1-\bar z_1\|_{l_1} + \|z_2-\bar z_2\|_{l_1})$ and such that (A4) is valid. 
Moreover, under these conditions $H$ satisfies (A2). Hence, the assertion of Theorem 2.2 
holds true. In order to determine critical values of the test, we propose the bootstrap 
procedure given by Franke and Wendel [19] (without estimating the regression 
function). The bootstrap innovations $(\varepsilon_t^*)_t$ are drawn with replacement from the set 
$\{\tilde\varepsilon_t = \varepsilon_t - n^{-1}\sum_{k=1}^{n}\varepsilon_k\}_{t=1}^{n}$, where $\varepsilon_t = X_t - g_0(X_{t-1})$, $t=1,\dots,n$. After choosing a starting 
value $X_0^*$ independently of $(\varepsilon_t^*)_{t\ge 1}$, the bootstrap sample $X_t^* = g_0(X_{t-1}^*) + \varepsilon_t^*$ as well as 
the bootstrap counterpart $T_n^* = n^{-1}\sum_{j=1}^{n}\sum_{k\neq j}H(Z_j^*,Z_k^*)$ of the test statistic with $Z_k^* = 
(X_k^*,X_{k-1}^*)'$, $k=1,\dots,n$, can be computed. In contrast to the previous subsection, the 
proposed bootstrap method leads to a degenerate kernel function. Obviously, the bootstrap 
sample is $\tau$-dependent in the sense of (A1*) and satisfies $E(|X_t^*|\mid Z_1,\dots,Z_n)\le C$ 
for some $C<\infty$ with probability tending to one. Theorem 1 of Diaconis and Freedman [13] 
yields the existence of a stationary solution to $X_t^* = g_0(X_{t-1}^*) + \varepsilon_t^*$ and that the 
distribution of any "reasonably" started process converges to the stationary one with 
exponential rate. In order to apply our theory, $X_0^*$ is assumed to be drawn from the 
stationary bootstrap distribution, conditionally on $X_1,\dots,X_n$. We employ Lemma 4.2 
of Neumann and Paparoditis [26] to verify convergence of the finite dimensional distributions. 
The application of this result requires the convergence of the conditional distributions, 
that is, $\sup_{x\in K} d(P^{X_t^*\mid X_{t-1}^*=x}, P^{X_t\mid X_{t-1}=x}) \stackrel{P}{\longrightarrow} 0$ for every compact $K\subset\mathbb{R}$ 
and $d(P,Q) = \inf_{X\sim P,\,Y\sim Q} E(|X-Y|\wedge 1)$. In the present context, this can be confirmed 
similarly to the proof of Lemma 4.1 by Neumann and Paparoditis [26] if the innovations 
of the original process have a bounded density. Summing up, all prerequisites of Theorem 3.1 
are satisfied. Hence, critical values of the above test can be determined using the 
proposed model-based bootstrap procedure. 
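
The bootstrap scheme described above can be sketched in Python. This is a minimal illustration under stated assumptions rather than the implementation used in the paper: the function names `compute_tn` and `bootstrap_critical_value`, the burn-in period (used here as a practical stand-in for drawing $X_0^*$ from the stationary bootstrap distribution), and all default tuning values are hypothetical choices. The regression function `g0` and the kernel are supplied by the user; the kernel should be bounded, even, and Lipschitz continuous.

```python
import numpy as np

def compute_tn(x, g0, kernel, h):
    """T_n = (n*sqrt(h))^{-1} * sum_{j != k} e_j e_k K((X_{j-1} - X_{k-1}) / h)."""
    lagged, current = x[:-1], x[1:]
    resid = current - g0(lagged)                      # e_t = X_t - g0(X_{t-1})
    n = resid.size
    weights = np.array(kernel((lagged[:, None] - lagged[None, :]) / h), dtype=float)
    np.fill_diagonal(weights, 0.0)                    # drop the j == k terms
    return float(resid @ weights @ resid) / (n * np.sqrt(h))

def bootstrap_critical_value(x, g0, kernel, h, level=0.05,
                             n_boot=200, burn_in=100, seed=0):
    """Approximate the (1 - level)-quantile of T_n^* by resampling centred residuals."""
    rng = np.random.default_rng(seed)
    resid = x[1:] - g0(x[:-1])
    centred = resid - resid.mean()                    # centred residuals
    n = resid.size
    stats = np.empty(n_boot)
    for b in range(n_boot):
        eps = rng.choice(centred, size=n + burn_in, replace=True)
        path = np.empty(n + burn_in + 1)
        path[0] = x[0]                                # arbitrary start; burn-in is an assumption
        for t in range(1, path.size):
            path[t] = g0(path[t - 1]) + eps[t - 1]    # X_t^* = g0(X_{t-1}^*) + eps_t^*
        stats[b] = compute_tn(path[burn_in:], g0, kernel, h)
    return float(np.quantile(stats, 1.0 - level))
```

A test at the chosen level would reject the null hypothesis when `compute_tn(x, g0, kernel, h)` exceeds the value returned by `bootstrap_critical_value`.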

5. Proofs 

5.1. Proofs of the main theorems 

Throughout this section, C denotes a positive finite generic constant. 

Proof of Theorem 2.1. First, we derive the limit distribution of $nU_{n,c}^{(K,L)}$, defined before 
Lemma 2.2. Afterwards, the asymptotic distributions of $nU_n$ and $nV_n$ are deduced by 
means of Lemma 2.1, Lemma 2.2, and a weak law of large numbers. 

The following modified representation of $\tilde h_c^{(K,L)}$ will be useful in the sequel: 

$$\tilde h_c^{(K,L)}(x,y) = \sum_{k,l=1}^{M(K,L)}\gamma_{k,l}^{(c,K,L)}\,q_k(x)\,q_l(y),$$

where $(q_k)_{k=1}^{M(K,L)}$ is an ordering of $\bigcup_{k\in\{-L,\dots,L\}^d}\bigl\{\{\Phi_{0,k}\}\cup\{\Psi_{j,k}^{(e)}\}_{e\in E,\,j\in\{0,\dots,J(K)-1\}}\bigr\}$ and 
$\gamma_{kl} = \gamma_{k,l}^{(c,K,L)}$, $k,l\in\{1,\dots,M(K,L)\}$, are the associated coefficients. Moreover, the 
introduction of $\tilde q_k(X_i) := q_k(X_i) - Eq_k(X_i)$, $k\in\{1,\dots,M(K,L)\}$, $i\in\{1,\dots,n\}$, allows for the 
compact notation of $nU_{n,c}^{(K,L)}$, 

$$nU_{n,c}^{(K,L)} = \sum_{k,l=1}^{M(K,L)}\gamma_{kl}\left(\frac{1}{\sqrt n}\sum_{i=1}^{n}\tilde q_k(X_i)\,\frac{1}{\sqrt n}\sum_{j=1}^{n}\tilde q_l(X_j) - \frac1n\sum_{i=1}^{n}\tilde q_k(X_i)\tilde q_l(X_i)\right).$$

The latter summand in the round brackets converges to $-E\tilde q_k(X_1)\tilde q_l(X_1)$ in probability by 
virtue of Lemma 5.1. In order to derive the limit distributions of the first summands, we 
consider $n^{-1/2}\sum_{i=1}^{n}(\tilde q_1(X_i),\dots,\tilde q_{M(K,L)}(X_i))'$. Due to the Cramér-Wold device, it suffices 
to investigate $\sum_{k=1}^{M(K,L)}t_k\,n^{-1/2}\sum_{i=1}^{n}\tilde q_k(X_i)$, $\forall (t_1,\dots,t_{M(K,L)})'\in\mathbb{R}^{M(K,L)}$. Asymptotic 
normality can be established by applying the CLT of Neumann and Paparoditis [26] 
to $Q_i := \sum_{k=1}^{M(K,L)}t_k\tilde q_k(X_i)$, $i=1,\dots,n$. To this end, the prerequisites of this tool have to 
be checked. Obviously, we are given a strictly stationary sequence of centered bounded 
random variables. This implies in conjunction with the dominated convergence theorem 
that the Lindeberg condition is fulfilled. In order to show 



$$\frac1n\operatorname{var}(Q_1+\cdots+Q_n)\mathop{\longrightarrow}_{n\to\infty}\sigma^2 := \operatorname{var}(Q_1) + 2\sum_{k=2}^{\infty}\operatorname{cov}(Q_1,Q_k),$$

the validity of (A1) can be employed which moreover assures the existence of the limit $\sigma^2$. 
Then, 

$$\begin{aligned}
\left|\frac1n\operatorname{var}(Q_1+\cdots+Q_n) - \sigma^2\right|
&= \left|\frac2n\sum_{r=2}^{n}(n-[r-1])\operatorname{cov}(Q_1,Q_r) - 2\sum_{k=2}^{\infty}\operatorname{cov}(Q_1,Q_k)\right| \\
&\le 2\sum_{r=2}^{\infty}\min\left\{\frac{r-1}{n},1\right\}|\operatorname{cov}(Q_1,Q_r)| \\
&\le 4\|Q_1\|_\infty\operatorname{Lip}(Q_1)\sum_{r=2}^{\infty}\min\left\{\frac{r-1}{n},1\right\}\tau_{r-1},
\end{aligned}$$

where the latter inequality follows from (2.4). The summability condition of the dependence 
coefficients in connection with Lebesgue's dominated convergence theorem yields 
the desired result. Since $Q_{t_1}Q_{t_2}$ forms a Lipschitz continuous function, inequality (6.4) 
of Neumann and Paparoditis [26] holds true with $\theta_r = \operatorname{Lip}(Q_{t_1}Q_{t_2})\tau_r$. It is easy to convince 
oneself that their condition (6.3) is not needed if the involved random variables are 
uniformly bounded. Finally, we obtain 

$$n^{-1/2}(Q_1+\cdots+Q_n)\stackrel{d}{\longrightarrow}N(0,\sigma^2)$$

and hence, 

'■= zZ a kiM [ZlaZkz -A, 



k 1 ,k 2 <£{-L,...,L} d 
J(K)-1 

+ Z^ Z^ _ p o\kiM^]-M Zj j;k 2 D rM,k 2 \- 

3=0 fci,fc 2 e{-£,...,£} ti e=(e£,e£)'e.E 

Here, $(Z_k)_{k\in\{-L,\dots,L\}^d}$ and $(Z_{j;k}^{(e)})_{j\in\{0,\dots,J(K)-1\},\,e\in\{0,1\}^d,\,k\in\{-L,\dots,L\}^d}$, respectively, are 
centered and jointly normally distributed random variables. 
By Lemma 2.1 and Lemma 2.2, we have 

$$\lim_{c\to\infty}\limsup_{K\to\infty}\limsup_{L\to\infty}\sup_{n\in\mathbb{N}}n^2E(U_{n,c}^{(K,L)} - U_n)^2 = 0.$$

Since $nU_{n,c}^{(K,L)}\stackrel{d}{\longrightarrow}Z_c^{(K,L)}$, it remains to show 

$$\lim_{c\to\infty}\limsup_{K\to\infty}\limsup_{L\to\infty}E(Z_c^{(K,L)} - Z)^2 = 0 \qquad (5.1)$$

in order to prove that $nU_n\stackrel{d}{\longrightarrow}Z$ due to Billingsley [4], Theorem 4.2. To this end, we first 
show that $(Z_c^{(K,L)})_L$ is a Cauchy sequence in $L_2$. Note that $n(U_{n,c}^{(K,L_1)} - U_{n,c}^{(K,L_2)})\stackrel{d}{\longrightarrow}
Z_c^{(K,L_1)} - Z_c^{(K,L_2)}$. According to Theorem 5.3 of Billingsley [4], we obtain 
$E(Z_c^{(K,L_1)} - Z_c^{(K,L_2)})^2\le\liminf_{n\to\infty}n^2E(U_{n,c}^{(K,L_1)} - U_{n,c}^{(K,L_2)})^2$. The r.h.s. converges to zero as $L_1,L_2\to\infty$ 
by virtue of (5.10) in the proof of Lemma 2.2. Denoting the corresponding limit 
by $Z_c^{(K)}$, similar arguments yield 

$$\begin{aligned}
E(Z_c^{(K_1)} - Z_c^{(K_2)})^2
&\le 4\limsup_{L\to\infty}E(Z_c^{(K_1,L)} - Z_c^{(K_2,L)})^2 \\
&\le 4\limsup_{L\to\infty}\liminf_{n\to\infty}n^2E(U_{n,c}^{(K_1,L)} - U_{n,c}^{(K_2,L)})^2 \\
&\le 16\liminf_{n\to\infty}n^2E(U_{n,c}^{(K_1)} - U_{n,c}^{(K_2)})^2\mathop{\longrightarrow}_{K_1,K_2\to\infty}0
\end{aligned}$$

according to (5.9) of the proof of Lemma 2.2. In view of Lemma 2.1, we obtain (5.1) by 
applying the above method once again. This in turn leads to the desired limit distribution 
of $nU_n$. 

Based on the result concerning $U$-type statistics, the limit distribution of $nV_n$ 
can be established. Since $V_n = U_n + n^{-2}\sum_{k=1}^{n}h(X_k,X_k)$, it remains to verify that 
$n^{-1}\sum_{k=1}^{n}h(X_k,X_k)\stackrel{P}{\longrightarrow}Eh(X_1,X_1)$. This in turn is a consequence of Lemma 5.1. □ 

Proof of Theorem 2.2. On the basis of Lemma 2.3 similar arguments as in the proof 
of Theorem 2.1 yield $nU_n\stackrel{d}{\longrightarrow}Z$. Moreover, Lemma 5.1 implies $n^{-1}\sum_{k=1}^{n}h(X_k,X_k)\stackrel{P}{\longrightarrow}
Eh(X_1,X_1)$. Thus, $nV_n\stackrel{d}{\longrightarrow}Z + Eh(X_1,X_1)$. □ 
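
The relation between the two statistics used at the end of this proof is a purely algebraic identity and can be checked numerically. The following sketch assumes the normalisation $U_n = n^{-2}\sum_{j\ne k}h(X_j,X_k)$ and $V_n = n^{-2}\sum_{j,k}h(X_j,X_k)$; the function name `u_and_v_statistics` and the vectorised kernel interface are hypothetical.

```python
import numpy as np

def u_and_v_statistics(sample, kernel):
    """Return (U_n, V_n) for a one-dimensional sample and a vectorised kernel h(x, y)."""
    h = kernel(sample[:, None], sample[None, :])   # matrix of h(X_j, X_k)
    n = sample.size
    v_n = h.sum() / n**2                           # all pairs, diagonal included
    u_n = (h.sum() - np.trace(h)) / n**2           # pairs with j != k only
    return u_n, v_n
```

With these conventions, `n * v_n - n * u_n` equals the sample mean of $h(X_k,X_k)$, which is the term handled by the weak law of large numbers.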
Proof of Theorem 3.1. Due to Lemma 3.2, it suffices to verify distributional convergence. 
To this end, we introduce 

x e n c n x^ n x& n {x n \\\e n - e\\ h < s n } 

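such that 

$$\mathcal{L}((X_{t_1}^{*\prime},\dots,X_{t_k}^{*\prime})'\mid\mathbb{X}_n = x_n) = \mathcal{L}((X_{t_1+l}^{*\prime},\dots,X_{t_k+l}^{*\prime})'\mid\mathbb{X}_n = x_n), \qquad (5.2)$$

$$\mathcal{L}((X_{t_1}^{*\prime},X_{t_2}^{*\prime})'\mid\mathbb{X}_n = x_n)\Longrightarrow\mathcal{L}((X_{t_1}',X_{t_2}')') \qquad (5.3)$$

uniformly for any sequence $(x_n)_{n\in\mathbb{N}}$ with $x_n\in\mathcal{X}_n$ and $t_1,\dots,t_k,k,l\in\mathbb{N}$. Moreover, the 
null sequence $(\delta_n)_{n\in\mathbb{N}}$ can be chosen such that on $\mathcal{X}_n^{\theta}$, $\hat\theta_n\in U(\theta)$ and $P(\mathbb{X}_n\in\mathcal{X}_n^{\theta})\to_{n\to\infty}1$ 
hold. Hence, to prove $nU_n^*\stackrel{d}{\longrightarrow}Z$, in probability, it suffices to verify that $nU_n^*$ converges 
to $Z$ in distribution conditionally on $\mathbb{X}_n = x_n$ for any sequence $(x_n)_n$ with $x_n\in\mathcal{X}_n^{\theta}$. Now, 
we take an arbitrary sequence $(x_n)_n$ with $x_n\in\mathcal{X}_n^{\theta}$, $n\in\mathbb{N}$. 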

In order to show that it suffices to investigate statistics with bounded kernels, we 
consider the degenerate version h* of 

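$$\begin{cases}
h(x,y,\hat\theta_n) & \text{for } |h(x,y,\hat\theta_n)|\le c_h(\hat\theta_n),\\
-c_h(\hat\theta_n) & \text{for } h(x,y,\hat\theta_n) < -c_h(\hat\theta_n),\\
c_h(\hat\theta_n) & \text{for } h(x,y,\hat\theta_n) > c_h(\hat\theta_n)
\end{cases}$$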

with c h (9 n ) := m&x xye[ _ cc]d \h(x,y,9 n )\ < m&x^,^^^^^^ \h(x,y,9)\ < oo. The 
associated [/-statistics are denoted by U* c . Now, imitating the proof of Lemma 2.1 
results in 

limsu P 7i 2 E[([/ r t - C/* J 2 |X„ = x n ] — > 0. 

Within the calculations, the relation \imsup n ^ 00 P(X^ ^ (— c, c) d |X„ = x n ) < P{X\ ^ 
(— c, c) d ) — 5~c->oo has to be invoked which follows from Portmanteau's theorem in con- 
junction with (5.3). Next, we approximate the bounded kernel by the degenerate version 
of 

K iK ' L) --= E *.**o*w J xf E EfeWl^ 

k u k 2 e{-L,...,L} d j=0 k 1 .k 2 e{-L....,L} d eeE 

where Sg fc2 = ff** x1L dKfaVi0n)&oM x )®o,k a (v)te4v and $g$ M = ff R d xU * K( x >V> 
@n)^j ki k 2 ( x i y) ^ X Denoting the associated [/-statistic by leads to 

lim limsuplimsupn 2 E[([/* c - l> ( f < L) ) 2 |X„ = x n ] = 

K->co i^oo n-¥oo 

which can be proved by following the lines of the proof of Lemma 2.3. Here, J(K) is 
chosen as follows: We first select some b = b(K) < oo such that P(Xi £ (—b 1 b) d ) < 1/K. 
Afterwards, we choose J(K) such that max x ye ^_ b b ^d \h c (x,y,6) — he {x,y,6)\ < l/K 
and < A, where S,/, denotes the length of the support of the scale func- 

tion cf>. The index J(K) can be determined independently of n on (X^) n since 
max^e^^jd \h* c (x,y,9 n )-h c (x,y,9)\ — > and max x> y & [_ bib y 

@n)\ — ^ 0, as n — > oo, due to the continuity assumptions on /. Here, hc KJX> is defined by 
the substitution of ^2 kl k2£ ^_ L L y through Y] kl k 2 ez, d m * nc definition of h, 
note that 



\hi K \x,y,6)-h* (K \x,y, 

(K) 



■■(K,L) 



Also 



-.(c) 



„( c ) 



R d xl 



o( c . e ) , fl(c,e) 

P rMM ^o P rMM ■ 



R d xl 



h c {x, y, 0)$om O)$o,fc 2 (y) dx dy, 



h c (x, y, 0)^ klM {x, y) dxdy 



TJ *(KX)^2 
V n.c 



on (£„)„. Hence, lim^oo n 2 E[(U n ) 

of Un[^' L ^ is obtained by substituting fe2 and Pj C j^k 2 
through a^ ifc2 and y8j!fc^ 2 , respectively. 



:„] = 0, where the kernel 
in the kernel of 



(K,L) 



Thus, the next step is the application of the CLT of Neumann and Paparoditis [26] 



to nil, 



:{K,L) 



For this purpose, we introduce $Q_i^* := \sum_{k=1}^{M(K,L)}t_k\tilde q_k^*(X_i^*)$, $t_1,\dots,t_{M(K,L)}\in\mathbb{R}$, 
where $\tilde q_k^*$ denotes the centered version (w.r.t. $P^{X_1^*\mid\mathbb{X}_n = x_n}$) of $q_k$ and $q_k$ is defined as in 
the proof of Theorem 2.1. Obviously, given $X_1,\dots,X_n$, the sequence $(Q_i^*)_i$ is centered 
and has uniformly bounded second moments. Due to (A1*)(i), the Lindeberg condition 
is satisfied. In order to show that for arbitrary $\varepsilon>0$ the inequalities $|\frac1n\operatorname{var}(Q_1^*+\cdots+
Q_n^*\mid\mathbb{X}_n = x_n) - \sigma^2|\le\varepsilon$, $\forall n\ge n_0(\varepsilon)$, hold true with $\sigma^2$ as in the proof of Theorem 2.1, the 
abbreviations $\operatorname{var}^*(\cdot) = \operatorname{var}(\cdot\mid\mathbb{X}_n = x_n)$ and $\operatorname{cov}^*(\cdot) = \operatorname{cov}(\cdot\mid\mathbb{X}_n = x_n)$ are used. Hence, 



-var*[Q* 
n 



< 2 min 

r=2 



< 2 min 

r=2 



coY*(QlQ* r )\ 
cov*(Q*,Q*.)| 



|var*(QJ)-var(Qi)|+2 



var*(Q*) 



2^T cov *(Q* 1: Q* r ) 

r=2 



R-l 

E 

r=2 



cov*(Q*,Q*) - cov(Qi,Q r )" 



cov*(Q*,Q* r ) 



r>R. 



r>R 



cov(Qi,Q r ) 



By (A1) and (A1*), $R$ can be chosen such that $|\sum_{r>R}\operatorname{cov}(Q_1,Q_r)| + |\sum_{r>R}\operatorname{cov}^*(Q_1^*,Q_r^*)|
\le\varepsilon/4$. Moreover, (A1*) implies that the first summand can be bounded from above 
by $\varepsilon/4$ as well if $n\ge n_0(\varepsilon)$ for some $n_0(\varepsilon)\in\mathbb{N}$. According to the convergence of the 
two-dimensional distributions and the uniform boundedness of $(Q_k^*)_{k\in\mathbb{Z}}$, it is possible 
to pick $n_0(\varepsilon)$ such that additionally the two remaining summands are bounded by $\varepsilon/8$. 
For the validity of the CLT of Neumann and Paparoditis [26] in probability, it remains 
to verify their inequality (6.4). By Lipschitz continuity of $Q_{t_1}^*Q_{t_2}^*$ this holds with 
$\theta_r = \operatorname{Lip}(Q_{t_1}^*Q_{t_2}^*)\bar\tau_r\le C\bar\tau_r$. The application of the continuous mapping theorem results 
in $nU_{n,c}^{*(K,L)}\stackrel{d}{\longrightarrow}Z_c^{(K,L)}$, in probability. Invoking the same arguments as in the proof of 
Theorem 2.1, this implies $nU_n^*\stackrel{d}{\longrightarrow}Z$, in probability. 

In order to obtain the analogous result of convergence for nV* , we define ^CI»,neN, 
such that \E(\h(XZ,XZ,6 n )\\X n = x n ) - E|/i(JTi,Xi,0)|| < r/ n yx n g X n . Here, the null 
sequence (7? ra )raGN is chosen in such a way that P(X n G XfJ — > n ->oo 1- Now, additionally 
to our previous considerations, 



P 



1 ™ 

- V h(X* ,X*,6 n )- Eh{X 1 , x 1 , 



i=l 



> e 







has to be proved for arbitrary e > and any sequence (cc„) nS N with x„ <G Xfj,n G N. 
According to the definition of the sets we get E(/i(X*,X*,0„)|X„ = x n ) — >n-»oo 

E/i(Xi , X\ , 6) . Therefore, it suffices to prove 



P 



1 " 

-V[/i(X fc *,X fe *,^)-E(/ l (X*,X 1 *,^)|X„ 

7") ^ * 



fc=l 



e 

> 2 



0. 



This in turn is a consequence of Lemma 5.1 since under the assumptions of the theorem 
the sequence of functions $(g^{(n)})_{n\in\mathbb{N}}$ with $g^{(n)}(\cdot) = h(\cdot,\cdot,\hat\theta_n) - E(h(X_1^*,X_1^*,\hat\theta_n)\mid\mathbb{X}_n = x_n)$ 
is uniformly integrable and satisfies the smoothness property presumed in Lemma 5.1. 
Finally, bootstrap consistency follows from Lemma 3.2. □ 



5.2. Proofs of auxiliary results 

First, we derive a weak law of large numbers for smooth functions of triangular arrays 
of $\tau$-dependent random variables. 

Lemma 5.1 (Weak law of large numbers). Let $(X_{n,k})_{k=1}^{n}$, $n\in\mathbb{N}$, be a triangular 
scheme of (row-wise) stationary, $\mathbb{R}^d$-valued, integrable random variables such that 
$\lim_{K\to\infty}\sup_{n\in\mathbb{N}}P(\|X_{n,1}\|_{l_1} > K) = 0$. Suppose that the coefficients $\bar\tau_r := \sup_{n>r}\tau_{r,n}$ 
satisfy $\bar\tau_r\to_{r\to\infty}0$, where 

$$\tau_{r,n} := \sup\{\tau(\sigma(X_{n,s_1},\dots,X_{n,s_u}),(X_{n,t_1}',X_{n,t_2}',X_{n,t_3}')')\mid u\in\mathbb{N},\ 1\le s_1\le\cdots\le s_u < s_u + r\le t_1\le t_2\le t_3\le n\}.$$

Moreover, suppose that the functions $g^{(n)}\colon\mathbb{R}^d\to\mathbb{R}$ with $Eg^{(n)}(X_{n,1}) = \mu$ are uniformly 
Lipschitz continuous on any bounded interval. If additionally the sequence 
$(g^{(n)}(X_{n,1}))_{n\in\mathbb{N}}$ is uniformly integrable, then 

$$\frac1n\sum_{k=1}^{n}g^{(n)}(X_{n,k})\stackrel{P}{\longrightarrow}\mu.$$
Proof. W.l.o.g. let $\mu = 0$. We prove that for arbitrary $\varepsilon,\eta>0$ there exists an $n_0$ such 
that for all $n\ge n_0$ the inequality $P(|n^{-1}\sum_{k=1}^{n}g^{(n)}(X_{n,k})| > \varepsilon)\le\eta$ holds. To this end, 
a truncation argument is invoked. Let $w_K$ denote a Lipschitz continuous, nonnegative 
function that is bounded from above by one such that $w_K(x) = 1$ for $x\in[-K,K]^d$ and 
$w_K(x) = 0$ for $x\notin[-K-1,K+1]^d$ with $K\in\mathbb{R}_+$. For a finite constant $M$, that is specified 
later, define functions $g_{M,K}^{(n)}\colon\mathbb{R}^d\to\mathbb{R}$ by 

$$g_{M,K}^{(n)}(x) := \begin{cases}
g^{(n)}(x)w_K(x) & \text{for } |g^{(n)}(x)w_K(x)|\le M,\\
-M & \text{for } g^{(n)}(x)w_K(x) < -M,\\
M & \text{for } g^{(n)}(x)w_K(x) > M
\end{cases}$$

and $\tilde g_{M,K}^{(n)}$ by $\tilde g_{M,K}^{(n)}(x) = g_{M,K}^{(n)}(x) - Eg_{M,K}^{(n)}(X_{n,1})$. This allows for the estimation 



\ k=l J \ k=l 



k) - 9 ( M.K^. X n,k) 



> 



P[\Eg$ K (X n>1 )\>^) 



1 " 



k=l 



BUp%W(X nil )|l| 1 , { „) CJfBl) | >w + MsupP(||j: ni i||, 1 >JiO 
n€N neN 



According to Markov's inequality, the first summand on the r.h.s. can be bounded by 

■■i 

e 

Since the functions g^ n \n G N, are centered, we additionally obtain 

(ni 



P[\Eg^' K (X nA )\> 



<P[ su V E\g^ K (X n>1 )-gW(X ntl )\ > - 

VneN <3 



» , ? (n) 
< pfsu P E|. 9 (")(X„, 1 )|l |g( „ )(Xn i)|>M +MsupP(||X„, 1 ||, 1 > A') > | 

\neN n£N o 

Therefore, by choosing M and K — K(M) sufficiently large, we get 
' 1 ™ 

-J29 (n) (Xn,k)-9$ K (X n , 



P 



k=i 



» 



Concerning the remaining term, Chebyshev's inequality leads to 



1 " 

~y^,9M.K( X n,k) 



k=l 



>3^ 



9M 2 
E E 5m (^".J )-9m'^ (^n,fc ) • 
3<k 



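Thus, it remains to derive an upper bound for $n^{-2}\sum_{j<k}|E\tilde g_{M,K}^{(n)}(X_{n,j})\tilde g_{M,K}^{(n)}(X_{n,k})|$ that 
vanishes asymptotically. For this purpose, we introduce a copy $\tilde X_{n,k}$ of $X_{n,k}$ that is 
independent of $X_{n,j}$ and such that $E\|X_{n,k} - \tilde X_{n,k}\|_{l_1}\le\tau_{k-j,n}$. Due to their construction, 
the functions $\tilde g_{M,K}^{(n)}$ are Lipschitz continuous uniformly in $n$ and with a constant $C(M,K)$. 
This implies 

$$\begin{aligned}
\frac{1}{n^2}\sum_{j<k}|E\tilde g_{M,K}^{(n)}(X_{n,j})\tilde g_{M,K}^{(n)}(X_{n,k})|
&= \frac{1}{n^2}\sum_{j<k}|E\tilde g_{M,K}^{(n)}(X_{n,j})[\tilde g_{M,K}^{(n)}(X_{n,k}) - \tilde g_{M,K}^{(n)}(\tilde X_{n,k})]| \\
&\le \frac{2MC(M,K)}{n^2}\sum_{j<k}\tau_{k-j,n}
\le \frac{2MC(M,K)}{n}\sum_{r=1}^{n}\bar\tau_r,
\end{aligned}$$

where the remaining term converges to zero according to Cauchy's limit theorem, cf. 
Knopp [22]. □ 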
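
The truncation used in this proof can be made concrete with a short Python sketch. The particular cutoff $w_K(x) = \min\{1,\max\{0, K+1-\|x\|_\infty\}\}$ is an assumption made here for illustration (the proof only requires some Lipschitz continuous function with the stated properties), and the names `cutoff` and `truncate` are hypothetical.

```python
import numpy as np

def cutoff(x, K):
    """One admissible w_K: equals 1 on [-K, K]^d, 0 outside [-K-1, K+1]^d, Lipschitz in between."""
    x = np.atleast_2d(x)                       # rows are points in R^d
    sup_norm = np.abs(x).max(axis=1)
    return np.clip(K + 1.0 - sup_norm, 0.0, 1.0)

def truncate(g, M, K):
    """Return g_{M,K}: the function g * w_K clipped to the interval [-M, M]."""
    def g_MK(x):
        x = np.atleast_2d(x)
        return np.clip(g(x) * cutoff(x, K), -M, M)
    return g_MK
```

The resulting function is bounded by $M$ and vanishes outside $[-K-1,K+1]^d$, which is what makes the covariance bound via Lipschitz continuity available.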

In order to prove Lemma 2.1, Lemma 2.2, and Lemma 2.3, an approximation of terms 
of the structure 



1 " 



n 

i,j,k,l=l 

is required. Here, H denotes a symmetric, degenerate kernel function. Assuming that 
(X n )„ e N satisfies (Al), we obtain 

Z„<A J2 |Eff(A^,X,)ff(X fc ,A ; )|<8 sup E\H(X u X 1+k )\ 2 + ^J2J2 Z nl 

i<j;k<l;i<k l<k<n r=1 (=1 

with 

l<i<j\k<l;j<l<n 
r:— min{j, — — maxl^A:} 

l<i<j\i<k\k<l<n 
r\— I — max{j,fc} >min{j,fc} — i 

Z<$:= Yl m(X, l ,X J )H(X k ,X l )-EH(X ll xf ) )H(xl r \xl r) )l 

l<i<k<l<j<n 
r: — k — i~>j—l 

Z$.:= J2 miX^X^HiX^X^-EHiX^X^HiX^X^. 

l<i<k<Kj<n 
r:—j~l>k—i 

Here, in every summand of Zn,l and Zn}r the vector (X^',X^', xj^')' is chosen such 
that it is independent of the random variable X h (xj r) ' ; X^' , x[ r) ')' = (X^X'^X'J , 
and (2.3) holds. Within Zn}r (resp., Zn} r ), the random variable xj r ' (resp., -X^ ) is chosen 

to be independent of the vector (X'^X^X'J (resp., (X^, X' k , X^') such that x\ r) = X t 

(resp., = Xj) and (2.3) holds. This may possibly require an enlargement of the 
underlying probability space. Moreover, note that the subtrahends of these expressions 
vanish due to the degeneracy of H and that the number of summands of Zn\,t = 1, . . . , 4, 
is bounded by (r + l)n 2 . For sake of notational simplicity, the upper index r is omitted 
in the sequel. 

Proof of Lemma 2.1. For $c>0$, we define $c_h := \max_{x,y\in[-c,c]^d}|h(x,y)|$, 

$$\tilde h^{(c)}(x,y) := \begin{cases}
h(x,y) & \text{for } |h(x,y)|\le c_h,\\
-c_h & \text{for } h(x,y) < -c_h,\\
c_h & \text{for } h(x,y) > c_h
\end{cases}$$

and its degenerate version 

$$h_c(x,y) := \tilde h^{(c)}(x,y) - \int\tilde h^{(c)}(x,y)P_X(dx) - \int\tilde h^{(c)}(x,y)P_X(dy) + \int\!\!\int\tilde h^{(c)}(x,y)P_X(dx)P_X(dy).$$

The approximation error $n^2E(U_n - U_{n,c})^2$ can be reformulated in terms of $Z_n$ with kernel 
$H = H^{(c)} := h - h_c$. Hence, it remains to verify that $\sup_{k\in\mathbb{N}}E|H^{(c)}(X_1,X_{1+k})|^2$ and 
$\sup_{n\in\mathbb{N}}n^{-2}\sum_{r=1}^{n}\sum_{t=1}^{4}Z_{n,r}^{(t)}$ tend to zero as $c\to\infty$. First, we consider $\sup_{n\in\mathbb{N}}n^{-2}\sum_{r=1}^{n}
Z_{n,r}^{(1)}$; the remaining quantities can be treated similarly. The summands of $Z_{n,r}^{(1)}$ are 
bounded as follows: 

| (X t , X, ) H ^ (X k , Xi ) - EffM (X, , Xj ) (X k , X, ) | 

^El^^fe^O^^,^)-^^,^^]!^^^)^!-^]-^ 

+ E|if( c )(X fc ,X i )[^ (c) (X i ,X J -)-if (c) (X i ,X i )]l ( x'x/)^[-c,c]^l 

~ ~ (5.4) 
+ E|^ c )(X J ,X J )[i?( c )(X fe ,X0-H (c ^X fe ,X0]l (x ,^ ) , e[ _ c , c]2£! | 

+ E\H^\X i ,X j )[H^\X k ,X l )-H^\X k ,X l )]t { ^ 

= Ei + E2 + £?3 + £4 • 

The functions $H^{(c)}$ are obviously Lipschitz continuous uniformly in $c$. Therefore, an 
iterative application of Hölder's inequality to $E_2$ yields 

E 2 < (E|ff( c )(X i ,X i )-iT( c )(X i ,X j )|) 5 

y.(E\H^\X kl X l )\ 1/{1 ~ S \H^\x ll X ] )-H^\X l ,X J )\^^^^ 
<Cr;{(E|^ c )(X fc> ^OI (2 " fl/(1 " 4) l(^ 1 ^)' t S[-c 1 c] a ««) 1/(2 ~ S) (5-5) 

As $\sup_{k\in\mathbb{N}}E|h(X_1,X_{1+k})|^{\nu} < \infty$ for $\nu > (2-\delta)/(1-\delta)$, we obtain $E_2\le\tau_r^{\delta}\varepsilon_1(c)$ with 
$\varepsilon_1(c)\to_{c\to\infty}0$ after employing Hölder's inequality once again. Analogous calculations 
yield $E_4\le\tau_r^{\delta}\varepsilon_2(c)$ with $\varepsilon_2(c)\to_{c\to\infty}0$. Likewise, the approximation methods for $E_1$ 
and $E_3$ are equal. Therefore, only $E_1$ is considered: 



E 1 < 



E 



E 



h^(X k ,y)P x (dy)[H^(X. l ,X ] ) - H i - c \X il Xj)]t Xk 



h ( - c \y,X l )P x (dy)[H^(X i ,X j ) - iJ (c >(A t , A,)]l x , e[ _ CjC]£i 
^(x,y)Px(da;)P A -(ckj)[ J ff( c )(A 4 , A,) - ^(X;, A^)] 



— + 2 + -fio.,3- 
Analogous to (5.5), we obtain 



E hl < Cr 5 r 



h(X k ,y)-h^(X k ,y)P x (dy) 



(2-5)7(1-5) 



1 



supE| J ff( c )(Ai,Ai +fe )| (2 ^ )/{1 ^ ) +E|ii-( c )(A 4 ,A,)| (2 ^ )/(1 ^ ) 



1/(2-5) 



(l_5)/(2-5) 



1-5 



feeN 

[ \h(x,y)-h^(x,y)\ {2 - S)/{1 - S) 

, (1-5) 

x P x (dy)± x£[ _ C)C] <iP x (dx) 
<rf.e z {c) 

with $\varepsilon_3(c)\to_{c\to\infty}0$. The estimation of $E_{1,2}$ coincides with the previous one. The 
expression $E_{1,3}$ can be bounded as follows: 



£1,3 < Ct t 



\h(x,y) - h^(x,y)\P x (dx)P x (dy) 



<Cr r // \h(x,y)\l( x , iy ,yg[_ CtC ydP x (dx)P x (dy) 

J JTS. d xR d 
< T r 6 4 (c) 
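with $\varepsilon_4(c)\to_{c\to\infty}0$. To sum up, we have $E_1 + E_2 + E_3 + E_4\le\varepsilon_5(c)\tau_r^{\delta}$, where 
$\varepsilon_5(c)\to_{c\to\infty}0$ uniformly in $n$. This leads to 

$$\lim_{c\to\infty}\sup_{n\in\mathbb{N}}\frac{1}{n^2}\sum_{r=1}^{n}Z_{n,r}^{(1)}\le\lim_{c\to\infty}\sup_{n\in\mathbb{N}}\frac{1}{n^2}\sum_{r=1}^{n}(r+1)n^2\tau_r^{\delta}\varepsilon_5(c) = 0.$$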



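It remains to examine 

$$\sup_{k\in\mathbb{N}}E[H^{(c)}(X_1,X_{1+k})]^2\le C\Bigl\{\sup_{k\in\mathbb{N}}E[h(X_1,X_{1+k}) - \tilde h^{(c)}(X_1,X_{1+k})]^2 + E[h(X_1,\tilde X_1) - \tilde h^{(c)}(X_1,\tilde X_1)]^2\Bigr\}.$$

Here, $\tilde X_1$ denotes an independent copy of $X_1$. Similar arguments as before yield 
$\lim_{c\to\infty}\sup_{k\in\mathbb{N}}E[H^{(c)}(X_1,X_{1+k})]^2 = 0$. □ 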
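
The two-step construction in this proof (truncate the kernel at level $c_h$, then restore degeneracy by centring) can be illustrated numerically. The sketch below replaces the unknown distribution $P_X$ by the empirical measure of a sample, which is an assumption made purely for illustration; the function name `degenerate_truncated_kernel` is hypothetical, and the truncation level `c_h` is passed in directly rather than computed as a maximum over $[-c,c]^d$.

```python
import numpy as np

def degenerate_truncated_kernel(h_matrix, c_h):
    """Clip the kernel matrix (h(X_j, X_k))_{j,k} to [-c_h, c_h], then centre it.

    Centring uses the empirical measure in place of P_X (illustrative assumption):
    h_c = h~ - int h~ P(dx) - int h~ P(dy) + int int h~ P(dx) P(dy).
    """
    clipped = np.clip(h_matrix, -c_h, c_h)             # truncated kernel h~^(c)
    over_x = clipped.mean(axis=0, keepdims=True)       # integral over the first argument
    over_y = clipped.mean(axis=1, keepdims=True)       # integral over the second argument
    return clipped - over_x - over_y + clipped.mean()
```

By construction, every row mean and column mean of the returned matrix is zero, which is the empirical counterpart of degeneracy, and all entries are bounded by $4c_h$ in absolute value.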

The characteristics stated in the following two lemmas will be essential for a wavelet 
approximation of the kernel function h. 

Lemma 5.2. Given a Lipschitz continuous function $g\colon\mathbb{R}^d\to\mathbb{R}$, define a wavelet series 
approximation $g_j$ by $g_j(x) := \sum_{k\in\mathbb{Z}^d}\alpha_{j,k}\Phi_{j,k}(x)$, $j\in\mathbb{Z}$, where $\alpha_{j,k} = \int_{\mathbb{R}^d}g(x)\Phi_{j,k}(x)\,dx$. 
Then $g_j$ is Lipschitz continuous with a constant that is independent of $j$. 

Proof. In order to establish Lipschitz continuity, the function $g_j$ is decomposed into two 
parts 

$$g_j(x) = \sum_{k\in\mathbb{Z}^d}\int_{\mathbb{R}^d}\Phi_{j,k}(u)g(x)\,du\,\Phi_{j,k}(x) + \sum_{k\in\mathbb{Z}^d}\int_{\mathbb{R}^d}\Phi_{j,k}(u)[g(u) - g(x)]\,du\,\Phi_{j,k}(x) = H_1(x) + H_2(x).$$

According to the above choice of the scale function (with characteristics (1)-(3) of Section 2.2), 
the prerequisites of Corollary 8.1 of Härdle et al. [20] are fulfilled for $N = 1$. This 
implies that $\int_{-\infty}^{\infty}\sum_{l\in\mathbb{Z}}\phi(y-l)\phi(z-l)\,dz = 1$, $\forall y\in\mathbb{R}$. Based on this result, we obtain 

$$\sum_{k\in\mathbb{Z}^d}\int_{\mathbb{R}^d}\Phi_{j,k}(u)\Phi_{j,k}(x)\,du = \prod_{i=1}^{d}\int_{-\infty}^{\infty}\sum_{l\in\mathbb{Z}}2^{j}\phi(2^{j}u_i - l)\phi(2^{j}x_i - l)\,du_i = 1\quad\forall x\in\mathbb{R}^d,$$

by applying an appropriate variable substitution. To this end, note that for every fixed $x$, 
the number of non-vanishing summands can be bounded by a finite constant uniformly 
in $j$ because of the finite support of $\phi$. Therefore, the order of summation and integration 
is interchangeable. Hence, $H_1 = g$ which in turn immediately implies the desired 
continuity property for $H_1$. 

In order to investigate $H_2$, we define a sequence of functions $(\kappa_k)_{k\in\mathbb{Z}^d}$ by 

$$\kappa_k(x) = \int_{\mathbb{R}^d}\Phi_{j,k}(u)[g(u) - g(x)]\,du.$$

These functions are Lipschitz continuous with a constant decreasing in $j$: 

$$|\kappa_k(x) - \kappa_k(\bar x)|\le\operatorname{Lip}(g)\,O(2^{-jd/2})\|x - \bar x\|_{l_1}. \qquad (5.6)$$

Moreover, boundedness and Lipschitz continuity of $\phi$ yield 

$$\|\Phi_{j,k}\|_\infty = O(2^{jd/2})\quad\text{and}\quad|\Phi_{j,k}(x) - \Phi_{j,k}(\bar x)| = O(2^{j(d/2+1)})\|x - \bar x\|_{l_1}. \qquad (5.7)$$

Thus, 

$$\begin{aligned}
|H_2(x) - H_2(\bar x)|
&\le\sum_{k\in\mathbb{Z}^d}|\Phi_{j,k}(x)||\kappa_k(x) - \kappa_k(\bar x)| + \sum_{k\in\mathbb{Z}^d}|\kappa_k(\bar x)||\Phi_{j,k}(x) - \Phi_{j,k}(\bar x)| \\
&\le C\|x - \bar x\|_{l_1} + \sum_{k\in\mathbb{Z}^d}|\kappa_k(\bar x)||\Phi_{j,k}(x) - \Phi_{j,k}(\bar x)|.
\end{aligned}$$

Now, it has to be distinguished whether or not $\bar x\in\operatorname{supp}(\Phi_{j,k})$ in order to approximate the 
second summand. (Here, supp denotes the support of a function.) In the first case, it is 
helpful to illuminate $|\kappa_k(\bar x)| = |\int_{\mathbb{R}^d}\Phi_{j,k}(u)[g(u) - g(\bar x)]\,du|$. The integrand is non-trivial 
only if $u\in\operatorname{supp}(\Phi_{j,k})$. In these situations, $|g(u) - g(\bar x)| = O(2^{-j})$ by Lipschitz continuity. 
Consequently, we get 

$$|\kappa_k(\bar x)|\le O(2^{-j})\int_{\mathbb{R}^d}|\Phi_{j,k}(u)|\,du = O(2^{-j(d/2+1)})$$

which leads to 

$$\sum_{k\in\mathbb{Z}^d}|\kappa_k(\bar x)||\Phi_{j,k}(x) - \Phi_{j,k}(\bar x)|\le C\|x - \bar x\|_{l_1}$$

as the number of nonvanishing summands is finite, independently of the values of $x$ and $\bar x$. 
Therefore, Lipschitz continuity of $H_2$ is obtained as long as $\bar x\in\operatorname{supp}(\Phi_{j,k})$. 

In the opposite case, we only have to consider the situation of $x\in\operatorname{supp}(\Phi_{j,k})$ since the 
setting $x,\bar x\notin\operatorname{supp}(\Phi_{j,k})$ is trivial. With the aid of (5.6) and (5.7), the first term of the 
r.h.s. of 

$$|\kappa_k(\bar x)[\Phi_{j,k}(x) - \Phi_{j,k}(\bar x)]|\le|\kappa_k(\bar x) - \kappa_k(x)||\Phi_{j,k}(x)| + |\kappa_k(x)||\Phi_{j,k}(x) - \Phi_{j,k}(\bar x)| \qquad (5.8)$$

can be estimated from above by $C\|x - \bar x\|_{l_1}$. The investigation of the second summand 
is identical to the analysis of the case $\bar x\in\operatorname{supp}(\Phi_{j,k})$. 

Finally, we obtain $|H_2(x) - H_2(\bar x)|\le C\|x - \bar x\|_{l_1}$, where $C<\infty$ is a constant that is 
independent of $j$. This yields the assertion of the lemma. □ 

Lemma 5.3. Let $g\colon\mathbb{R}^d\to\mathbb{R}$ be a function that is continuous on some interval $(-c,c)^d$. 
For arbitrary $b\in(0,c)$ and $K\in\mathbb{N}$ there exists a $J(K,b,c)\in\mathbb{N}$ such that for $g$ and its 
approximation $g_J$ given by $g_J(x) = \sum_{k\in\mathbb{Z}^d}\alpha_{J,k}\Phi_{J,k}(x)$ it holds 

$$\max_{x\in[-b,b]^d}|g(x) - g_J(x)|\le 1/K\quad\forall J\ge J(K,b,c).$$

Proof. Given $b\in(0,c)$, we define $g^{(b,c)}(x) := g(x)w_{b,c}(x)$, where $w_{b,c}$ is a Lipschitz continuous 
and nonnegative weight function with compact support $S_w\subset(-c,c)^d$. Moreover, 
$w_{b,c}$ is assumed to be bounded from above by 1 and $w_{b,c}(x) := 1$ for $x\in(-b-\delta,
b+\delta)^d$ for some $\delta>0$ with $b+\delta<c$. Additionally, we set $\alpha_{J,k}^{(b,c)} := \int_{\mathbb{R}^d}g^{(b,c)}(u)\Phi_{J,k}(u)\,du$. 
Hence, 

$$\begin{aligned}
\max_{x\in[-b,b]^d}|g(x) - g_J(x)|
&\le\max_{x\in[-b,b]^d}\Bigl|g^{(b,c)}(x) - \sum_{k\in\mathbb{Z}^d}\alpha_{J,k}^{(b,c)}\Phi_{J,k}(x)\Bigr| + \max_{x\in[-b,b]^d}\Bigl|\sum_{k\in\mathbb{Z}^d}(\alpha_{J,k} - \alpha_{J,k}^{(b,c)})\Phi_{J,k}(x)\Bigr| \\
&=: \max_{x\in[-b,b]^d}A^{(J)}(x) + \max_{x\in[-b,b]^d}B^{(J)}(x).
\end{aligned}$$

Since $g^{(b,c)}\in C_0(\mathbb{R}^d)$, Theorem 8.4 of Wojtaszczyk [29] implies that there exists 
a $J_0(K,b,c)\in\mathbb{N}$ such that $\max_{x\in[-b,b]^d}A^{(J)}(x)\le 1/K$ for all $J\ge J_0(K,b,c)$. Moreover, 
the introduction of the finite set of indices 

$$Z(J) := \{k\in\mathbb{Z}^d\mid\Phi_{J,k}(x)\neq 0\text{ for some }x\in[-b,b]^d\}$$

leads to 

$$\max_{x\in[-b,b]^d}B^{(J)}(x) = \max_{x\in[-b,b]^d}\Bigl|\sum_{k\in Z(J)}(\alpha_{J,k} - \alpha_{J,k}^{(b,c)})\Phi_{J,k}(x)\Bigr|.$$

This term is equal to zero for all $J\ge J(K,b,c)$ and some $J(K,b,c)\ge J_0(K,b,c)$ since the 
definition of $g^{(b,c)}$ implies $\alpha_{J,k} = \alpha_{J,k}^{(b,c)}$, $\forall k\in Z(J)$, for all sufficiently large $J$. □ 
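
The uniform approximation stated in Lemma 5.3 can be illustrated numerically in dimension $d=1$. The sketch below uses the Haar scale function $\phi = 1_{[0,1)}$ as a stand-in; this is a deliberate simplification, since the Haar function is not Lipschitz continuous and therefore does not meet the characteristics required of the scale function in Section 2.2. For the Haar system the projection $g_J$ reduces to the average of $g$ over the dyadic cell of width $2^{-J}$ containing $x$. The function name `haar_projection` and the midpoint quadrature are hypothetical choices.

```python
import numpy as np

def haar_projection(g, J, x, n_quad=64):
    """Approximate g_J(x) = sum_k alpha_{J,k} Phi_{J,k}(x) for the Haar scale function (d = 1).

    For Haar, g_J(x) is the mean of g over the dyadic cell [k 2^-J, (k+1) 2^-J) containing x;
    the cell mean is approximated by a midpoint rule with n_quad nodes.
    """
    x = np.asarray(x, dtype=float)
    width = 2.0 ** (-J)
    left = np.floor(x / width) * width                   # left end of the cell containing x
    offsets = (np.arange(n_quad) + 0.5) / n_quad * width
    nodes = left[..., None] + offsets                    # quadrature nodes inside each cell
    return g(nodes).mean(axis=-1)
```

For a function with Lipschitz constant $\operatorname{Lip}(g)$, the error of this cell average is at most $\operatorname{Lip}(g)\,2^{-J}$ at every point, so it falls below any prescribed $1/K$ once $J$ is large enough.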



Proof of Lemma 2.2. The assertion of the lemma is verified in two steps. First, the 
bounded kernel $h_c$, constructed in the proof of Lemma 2.1, is approximated by $\tilde h_c^{(K)}$ which 
is defined by $\tilde h_c^{(K)}(x,y) = \sum_{k_1,k_2\in\mathbb{Z}^d}\alpha_{J(K);k_1,k_2}^{(c)}\Phi_{J(K),k_1}(x)\Phi_{J(K),k_2}(y)$ with $\alpha_{J(K);k_1,k_2}^{(c)} = 
\int\!\!\int_{\mathbb{R}^d\times\mathbb{R}^d}h_c(x,y)\Phi_{J(K),k_1}(x)\Phi_{J(K),k_2}(y)\,dx\,dy$. Here, the indices $(J(K))_{K\in\mathbb{N}}$ with $J(K)
\to_{K\to\infty}\infty$ are chosen such that the assertion of Lemma 5.3 holds true for $b = b(K)\in\mathbb{R}$ 
with $P(X_1\notin[-b,b]^d)\le K^{-1}$ and $c = 2b$. Since the function $\tilde h_c^{(K)}$ is not degenerate in 
general, we introduce its degenerate counterpart 

$$h_c^{(K)}(x,y) = \tilde h_c^{(K)}(x,y) - \int_{\mathbb{R}^d}\tilde h_c^{(K)}(x,y)P_X(dx) - \int_{\mathbb{R}^d}\tilde h_c^{(K)}(x,y)P_X(dy) + \int\!\!\int_{\mathbb{R}^d\times\mathbb{R}^d}\tilde h_c^{(K)}(x,y)P_X(dx)P_X(dy)$$

and denote the corresponding $U$-statistic by $U_{n,c}^{(K)}$. 
Now, the structure of the proof is as follows. First, we prove 

$$\sup_{n\in\mathbb{N}}n^2E(U_{n,c} - U_{n,c}^{(K)})^2\mathop{\longrightarrow}_{K\to\infty}0. \qquad (5.9)$$

In a second step, it remains to show that for every fixed $K$ 

$$\sup_{n\in\mathbb{N}}n^2E(U_{n,c}^{(K)} - U_{n,c}^{(K,L)})^2\mathop{\longrightarrow}_{L\to\infty}0. \qquad (5.10)$$

In order to verify (5.9), we rewrite $n^2E(U_{n,c} - U_{n,c}^{(K)})^2$ in terms of $Z_n$ with kernel function 
$H := H^{(K)} = h_c - h_c^{(K)}$. Hence, it remains to verify that $\sup_{n\in\mathbb{N}}n^{-2}\sum_{r=1}^{n}\sum_{t=1}^{4}Z_{n,r}^{(t)}$ 
and $\sup_{k\in\mathbb{N}}E|H^{(K)}(X_1,X_{1+k})|^2$ tend to zero as $K\to\infty$. Exemplarily, we investigate 
$\sup_{n\in\mathbb{N}}n^{-2}\sum_{r=1}^{n}Z_{n,r}^{(1)}$. The summands of $Z_{n,r}^{(1)}$ can be bounded as follows: 

\EH^ K ){X i ,X j )H^ K \X k ,X l )-H^ K \X i ,X j )H^ K \X k ,X l )\ 
< nH (K) (Xt , X|) [H W (X, , X 3 ) - W (X, , Xj)} | 
+ E|ff {K) {X U X, ) {Xk ,Xi)- { x k , x t )} \. 

Since further approximations are similar for both summands, we concentrate on the first 
one. Note that boundedness of $h_c$ implies uniform boundedness of $(H^{(K)})_K$ due to the 
compact support of the function $\phi$. Moreover, the constant $\operatorname{Lip}(H^{(K)})$ does not depend 
on $K$ in consequence of Lemma 5.2. Therefore, the application of Hölder's inequality 
leads to 

E|JT<*>(X fc ,X0[# ( *H*i,*i)-# ( *^ 
The construction of the sequence $(b(K))_K$ above allows for the following estimation: 

E|irW(x fc> *,)l 1/(1 " 4) 

= E\H^(X k ,X l )\ 1/ ^- 5 h Xk , Xie[ _ b{K)MK)]d +0(P(X 1 i {-b(K),b(K)] d )) 
< sup \ H W( Xi y)\W-*) + ° 

x,y£[~b(K),b(K)] d K 

According to Lemma 5.3 and the above choice of the sequence $(b(K))_K$, we obtain 
sup \H^ K \x,y)\ 

x,y£[-b(K),b(K)] d 



<^ + 2 SU P E\h c {x,X 1 )-h c K \x,X 1 )\ 

K x,y£[~b(K),b(K)] d 

h c (x, y) - h{ K ) (x, y)P x (dx)P x (dy) 
4 „ 



-, + 2 sup ^faXj-hWfaX^lx^wMKW 

x£[-b(K),b(K)] d 

2/ f \h c (x,y)-hi K \x,y)\P x (dx)P x (dy) 

Jm d Jm d \\-b(K).b!K)] d 



C 

< — . 
~ K 



Consequently, 

lEH^iX^X^H^^X^XO-EH^^^X^H^iX^X)] < Ce K r s r 

for some null sequence $(\varepsilon_K)_K$. This implies that $\sup_{n\in\mathbb{N}}n^{-2}\sum_{r=1}^{n}Z_{n,r}^{(1)}$ tends to zero 
as $K$ increases. Furthermore, one obtains $\sup_{k\in\mathbb{N}}E[H^{(K)}(X_1,X_{1+k})]^2 = O(K^{-1})$ similarly 
to the consideration of $E|H^{(K)}(X_k,X_l)|^{1/(1-\delta)}$ above. Thus, we get $\sup_{n}n^2E(U_{n,c} - 
U_{n,c}^{(K)})^2\to_{K\to\infty}0$. 

The main goal of the previous step was the multiplicative separation of the random 
variables which are cumulated in $h_c$. The aim of the second step is the approximation 
of $h_c^{(K)}$, whose representation is given by an infinite sum, by a function consisting of only 
finitely many summands. Similar to the foregoing part of the proof the approximation 
error $n^2E(U_{n,c}^{(K)} - U_{n,c}^{(K,L)})^2$ is reformulated in terms of $Z_n$ with kernel $H := H^{(L)} = h_c^{(K)} - 
h_c^{(K,L)}$. As before, we exemplarily take $n^{-2}\sum_{r=1}^{n}Z_{n,r}^{(1)}$ and $\sup_{k\in\mathbb{N}}E|H^{(L)}(X_1,X_{1+k})|^2$ 
into further consideration. Concerning the summands of $Z_{n,r}^{(1)}$, we obtain 



|Ei7 (L) (X l ,X j )H (L '> (X k ,Xi) - EH {L) (X U X )H {L) {X k ,X t )\ 
<E\H( L \x k ,X l )[H( L \x i ,X j )-H( L \x i ,X j )]t (xLtX ,y e[ _ BiB] ^ 

+ E|i/( L )(X fe ,X0[i/ (i H^,^)-^ (i H^,^)]l(^,A7)^[-B,B]-| 

+E|^ i )(x i) ^o[ff (i ^x fe) xo-ffW(x fe ,xo]i w ^p, e[ _ BiB]2li | 

+ E\H^\X l J ] )[H^\X k7 X l )-H^Hx k JO}\x^y^-BM^ 
= Ei + E2 + Eg + E4 

for arbitrary $B>0$. Obviously, it suffices to take the first two summands into further 
consideration. The two remaining terms can be treated similarly. First, note that $(H^{(L)})_L$ 
is uniformly bounded. Since $\phi$ and $\psi$ have compact support, the number of overlapping 
functions within $(\Phi_{0,k})_{k\in\{-L,\dots,L\}^d}$ and $(\Psi_{j,k}^{(e)})_{k\in\{-L,\dots,L\}^d,\,0\le j<J(K),\,e\in E}$ can be bounded 
by a constant that is independent of $L$. By Lipschitz continuity of $\phi$ and $\psi$ this leads to 
uniform Lipschitz continuity of $(\tilde h_c^{(K,L)})_L$. Due to the reformulation 
one can choose $(B = B(K,L))_{L\in\mathbb{N}}$ such that $\max_{x,y\in[-B,B]^d}|\tilde h_c^{(K)}(x,y) - \tilde h_c^{(K,L)}(x,y)| = 0$ 
and $B(K,L)\to_{L\to\infty}\infty$. This setting allows for the approximations 

E 1 <Cr^[E\H^\X k ,X l )\ 1 ^- 5 h (x , >x ,y e[ ^^ 

E 2 < Cr s r [P{X 1 £ [-B,B] 2d )] l - S . 

Analogously, it can be shown that svLp kefi E[H^(X u X 1+k )} 2 < CP(Xi £ [-B, B] d ). 
Finally, we obtain 



supn 2 E(C/W - UgMf < C[P(X 1 £ [-B, B] d )] 



sup^(r + l)r 7 5 



— > 0. 



Hence, the relations (5.9) and (5.10) hold. □ 

Proof of Lemma 2.3. In order to prove the assertion, we follow the lines of the proofs 
of Lemma 2.1, Lemma 2.2, and Lemma 5.2 and carry out some modifications. 

In a first step, we reduce the problem to statistics with bounded kernels $h_c$ defined in 
the proof of Lemma 2.1. To this end, we use the modified approximation 

$$\begin{aligned}
|H^{(c)}(x,y) - H^{(c)}(\bar x,\bar y)|
&\le[2f(x,\bar x,y,\bar y) + g(x,\bar x) + g(y,\bar y)][\|x - \bar x\|_{l_1} + \|y - \bar y\|_{l_1}] \\
&=: f_1(x,\bar x,y,\bar y)[\|x - \bar x\|_{l_1} + \|y - \bar y\|_{l_1}],
\end{aligned}$$

where $g$ is given by $g(x,\bar x) := \int_{\mathbb{R}^d}f(x,\bar x,z,z)P_X(dz)$. Under (A4)(i) Hölder's inequality 
yields 



E\H^(Y kl ,Y k2 )-H^(Y k3 ,Y ki )\ 



< (®lfi(Y kl , Y k2 ,Y k3 ,Y kl )] 1 '^ WY* k j (E||y fcl - Y k3 \\ h + E\\Y k2 - Y ki \\ h ) s 

for $Y_{k_i}$ ($k_i = 1,\dots,5$, $i = 1,\dots,4$), as defined in (A4). Plugging this inequality into the 
calculations of the proof of Lemma 2.1 yields $\sup_{n\in\mathbb{N}}n^2E(U_n - U_{n,c})^2\to_{c\to\infty}0$. 

The next step contains the wavelet approximation of the bounded kernel $h_c$. Defining 
$\tilde h_c^{(K)}$ and $U_{n,c}^{(K)}$ as in the proof of Lemma 2.2, analogous to the proof of Lemma 5.2 
there exists a $C>0$ such that 

\W\x,y)-W\x,y)\ 

< fi(x,x,y,y)[\\x-x\\ h + \\y-y\li,} + \H 2 (x, y) - H 2 (x, y)\ 

(5.11) 

< Cfi{x,x,y,y)[\\x- x\\ h + \\y - y\\ h ] 

+ {\ K kite(%,y^\\®j(K)M( x )®J{K)M{y) ~ $ j(K),fci(*) $ j(K),fc 2 (y)l), 

fci,fc 2 ez d 
where $\kappa_{k_1,k_2}$ is given by 

$$\kappa_{k_1,k_2}(x,y) := \int\!\!\int\Phi_{J(K),k_1}(u)\Phi_{J(K),k_2}(v)[h_c(u,v) - h_c(x,y)]\,du\,dv$$

and $H_2$ is defined as in the proof of Lemma 5.2. In order to approximate the last 
summand of (5.11), we distinguish again between the cases whether or not $(\bar x',\bar y')'\in
\operatorname{supp}(\Phi_{J(K),k_1}\times\Phi_{J(K),k_2})$. In the first case, an upper bound of order 
0( r max f 1 (x,x + a 1 ,y,y + a 2 )){\\x-x\\i 1 +\\y-y\\ h ) 

can be obtained since 



\Kk uk2 ^y)\ < Q2e[ _^max^ /2;(/o]d / 1 ( i ,x + a 1 ^,,- + fl2 ) 

1 

< (2-W(W)) max /i^i + Oi.y.y + aa). 

oi,02e[-S*/2 J W,S^/2JW]'* 

Here, $S_\phi$ denotes the length of the support of $\phi$. In the second case, a decomposition 
similar to (5.8) can be employed which leads to the upper bound 

0[f 1 (x,x,y,y)+ max f 1 (x,x + a 1 ,y,y + a 2 ))(\\x-x\\i 1 + \\y-y\\ h ). 

Consequently, we get 

\hi K \x,y)-hf\x,y)\<0( max f 1 (x,x + a 1 ,y,y + a 2 ) 

+ max fx(x,x + ai,y,y+ a 2 ) 

a 1 ,a 2 e[-S 4> /2-'(K) y S <p /2-'('<)]" 

+ f 1 (x,x,y,y)j x (Hx-xlta + 

=: /a^Xjj/.^dlx-xlli! + \\y-y\\ h ). 

This yields $|H_c^{(K)}(x,y) - H_c^{(K)}(\bar x,\bar y)| \le f_3(x,\bar x,y,\bar y)\,(\|x-\bar x\|_{l_1} + \|y-\bar y\|_{l_1})$ with
$f_3(x,\bar x,y,\bar y) = 2f_2(x,\bar x,y,\bar y) + \int_{\mathbb{R}^d} f_2(x,\bar x,z,z)\,P_X(dz) + \int_{\mathbb{R}^d} f_2(z,z,y,\bar y)\,P_X(dz)$.
Note that under (A4)(i), $E\{[f_3(Y_i,Y_j,Y_k,Y_l)]^{\eta}(\|Y_i\|_{l_1} + \|Y_j\|_{l_1} + \|Y_k\|_{l_1} + \|Y_l\|_{l_1})\} < \infty$ if $J(K)$ is
sufficiently large. Thus, we have

$$
E|H_c^{(K)}(Y_{k_1},Y_{k_2}) - H_c^{(K)}(Y_{k_3},Y_{k_4})| \le C\,(E\|Y_{k_1}-Y_{k_3}\|_{l_1} + E\|Y_{k_2}-Y_{k_4}\|_{l_1})^{\delta}
$$

for $Y_{k_i}$ ($k_i = 1,\dots,5$, $i = 1,\dots,4$), as defined in (A4). Moreover, Lemma 5.3 remains valid
with $g = h_c$. Therefore, one can follow the lines of the proof of Lemma 2.3 and plug in
the inequality above. This procedure leads to $\sup_{n\in\mathbb{N}} n^2 E(U_{n,c} - U_{n,c}^{(K)})^2 \longrightarrow_{K\to\infty} 0$.
In the third step of the proof, we verify $\sup_{n\in\mathbb{N}} n^2 E(U_{n,c}^{(K)} - U_{n,c}^{(K,L)})^2 \longrightarrow_{L\to\infty} 0$. For
this purpose, it suffices to plug in a modified approximation of $H_c^{(K,L)}(x,y) - H_c^{(K)}(x,y)$
into the second part of the proof of Lemma 2.2. Lipschitz continuity of $h_c^{(K,L)}$ implies

$$
|H^{(L)}(x,y) - H^{(L)}(\bar x,\bar y)| \le f_4(x,\bar x,y,\bar y)\,[\|x-\bar x\|_{l_1} + \|y-\bar y\|_{l_1}]
$$

with $f_4(x,\bar x,y,\bar y) = C + f_3(x,\bar x,y,\bar y)$. Since $f_4$ satisfies the moment assumption of (A4)(i)
with $A = 0$ for sufficiently large $J(K)$, we obtain

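$$
E|H^{(L)}(Y_{k_1},Y_{k_2}) - H^{(L)}(Y_{k_3},Y_{k_4})| \le C\,[E(\|Y_{k_1}-Y_{k_3}\|_{l_1} + \|Y_{k_2}-Y_{k_4}\|_{l_1})]^{\delta}.
$$

Hence, $\sup_{n\in\mathbb{N}} n^2 E(U_{n,c}^{(K)} - U_{n,c}^{(K,L)})^2 \longrightarrow_{L\to\infty} 0$. Summing up the three steps yields

$$
\lim_{c\to\infty}\,\limsup_{K\to\infty}\,\limsup_{L\to\infty}\,\sup_{n\in\mathbb{N}} n^2 E(U_n - U_{n,c}^{(K,L)})^2 = 0. \qquad \Box
$$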
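The way the three steps combine can be written out as follows. This is only a sketch of the standard argument, based on the elementary inequality $(a+b+c)^2 \le 3(a^2+b^2+c^2)$; the symbols $U_{n,c}$, $U_{n,c}^{(K)}$ and $U_{n,c}^{(K,L)}$ are assumed here to denote the statistics with truncated kernel, wavelet-approximated kernel and its Lipschitz modification, respectively.

```latex
\begin{align*}
n^2 E\bigl(U_n - U_{n,c}^{(K,L)}\bigr)^2
 &\le 3\, n^2 E\bigl(U_n - U_{n,c}\bigr)^2
   && \text{(step 1, vanishes as } c \to \infty) \\
 &\quad + 3\, n^2 E\bigl(U_{n,c} - U_{n,c}^{(K)}\bigr)^2
   && \text{(step 2, vanishes as } K \to \infty \text{ for fixed } c) \\
 &\quad + 3\, n^2 E\bigl(U_{n,c}^{(K)} - U_{n,c}^{(K,L)}\bigr)^2
   && \text{(step 3, vanishes as } L \to \infty \text{ for fixed } c, K).
\end{align*}
```

Taking $\sup_n$ on both sides and then letting $L \to \infty$, $K \to \infty$ and $c \to \infty$ in this order removes the third, second and first term in turn, which is why the limits appear nested in exactly that order.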

Proof of Lemma 3.2. A positive variance of $Z$ implies the existence of constants $V > 0$
and $c_0 > 0$ such that for every $c \ge c_0$ we can find a $K_0 \in \mathbb{N}$ such that for every $K \ge K_0$
there is an $L_0$ with $\operatorname{var}(Z_c^{(K,L)}) \ge V$, $\forall L \ge L_0$. Moreover, uniform equicontinuity of the
distribution functions of $(((Z_c^{(K,L)})_L)_K)_c$ yields the desired property of $Z$. Using matrix
notation for $Z_c^{(K,L)}$, we obtain

$$
Z_c^{(K,L)} = C_c^{(K,L)} + \sum_{k_1,k_2=1}^{M(K,L)} \gamma_{k_1,k_2}^{(c,K,L)}\, Z_{k_1}^{(K,L)} Z_{k_2}^{(K,L)} = C_c^{(K,L)} + (Z^{(K,L)})'\,\Gamma_c^{(K,L)}\,Z^{(K,L)}
$$

with a constant $C_c^{(K,L)}$, a symmetric matrix of coefficients $\Gamma_c^{(K,L)}$, and a normal vector
$Z^{(K,L)} = (Z_1^{(K,L)},\dots,Z_{M(K,L)}^{(K,L)})'$. Hence, $Z_c^{(K,L)} - C_c^{(K,L)}$ can be rewritten as follows:



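$$
Z_c^{(K,L)} - C_c^{(K,L)} \overset{d}{=} \tilde Y'\,(U^{(K,L)})'\,\Lambda^{(K,L)}\,U^{(K,L)}\,\tilde Y = Y'\,\Lambda^{(K,L)}\,Y = \sum_{k=1}^{M(K,L)} \lambda_k^{(c,K,L)}\,Y_k^2.
$$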

Here $U^{(K,L)}$ is a certain orthogonal matrix, $\Lambda^{(K,L)} := \operatorname{diag}(\lambda_1^{(c,K,L)},\dots,\lambda_{M(K,L)}^{(c,K,L)})$ with
$|\lambda_1^{(c,K,L)}| \ge \dots \ge |\lambda_{M(K,L)}^{(c,K,L)}|$, and $\tilde Y$ as well as $Y$ are multivariate standard normally
distributed random vectors. For notational simplicity, we suppress the upper index $(c,K,L)$
in the sequel. Due to the above choice of the triple $(c,K,L)$, either $\sum_{k=1}^{4}\lambda_k^2$ or
$\sum_{k=5}^{M(K,L)}\lambda_k^2$ is bounded from below by $V/4$. In the first case, $\lambda_1 \ge \sqrt{V/16}$ holds true,
which implies

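$$
P\bigl(Z_c^{(K,L)} \in [x-\varepsilon, x+\varepsilon]\bigr) \le \int_0^{2\varepsilon} f_{\lambda_1 Y_1^2}(t)\,dt \le P(Y_1^2 \le 2\varepsilon)\,\max\Bigl\{1, \frac{4}{\sqrt V}\Bigr\} \qquad \forall x \in \mathbb{R}.
$$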

Here, the first inequality results from the fact that convolution preserves the continuity
properties of the smoother function. In the opposite case, that is, $\sum_{k=5}^{M(K,L)}\lambda_k^2 \ge V/4$,
it is possible to bound the uniform norm of the density function of $Z_c^{(K,L)}$ by means of
its variance. To this end, we first consider the characteristic function $\varphi_{Z_c^{(K,L)}}$ of $Z_c^{(K,L)}$
and assume w.l.o.g. that $M(K,L)$ is divisible by 4. Defining a sequence $(\mu_k)_{k=1}^{M(K,L)/4}$ by
$\mu_k = \lambda_{4k}$ for $k \in \{1,\dots,M(K,L)/4\}$ allows for the approximation:

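$$
|\varphi_{Z_c^{(K,L)}}(t)| = \Bigl(\prod_{j=1}^{M(K,L)} (1 + 4\lambda_j^2 t^2)\Bigr)^{-1/4}
\le \Bigl(\prod_{j=1}^{M(K,L)/4} (1 + 4\mu_j^2 t^2)\Bigr)^{-1}
\le \frac{1}{1 + 4(\mu_1^2 + \dots + \mu_{M(K,L)/4}^2)\,t^2}.
$$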

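By inverse Fourier transform, we obtain the following result concerning the density
function of $Z_c^{(K,L)}$:

$$
\begin{aligned}
\|f_{Z_c^{(K,L)}}\|_\infty
&\le \frac{1}{2\pi}\,\|\varphi_{Z_c^{(K,L)}}\|_1
\le \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{1}{1 + 4(\mu_1^2 + \dots + \mu_{M(K,L)/4}^2)\,t^2}\,dt \\
&\le \frac{1}{2\pi}\,\frac{1}{2\sqrt{\mu_1^2 + \dots + \mu_{M(K,L)/4}^2}}\int_{-\infty}^{\infty} \frac{1}{1+u^2}\,du
= \frac{1}{2\sqrt{4(\mu_1^2 + \dots + \mu_{M(K,L)/4}^2)}} \\
&\le \frac{1}{2\sqrt{\sum_{k=5}^{M(K,L)}\lambda_k^2}} \le \frac{1}{\sqrt V}.
\end{aligned}
$$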

Thus, $P(Z_c^{(K,L)} \in [x-\varepsilon, x+\varepsilon]) \le 2\varepsilon/\sqrt V$, which completes the study of the case
$\sum_{k=5}^{M(K,L)}\lambda_k^2 \ge V/4$ and finally yields the assertion. □
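The grouping argument used in the second case can be checked numerically. The following is a minimal sketch, not part of the original proof: it draws hypothetical coefficients `lams`, sorts them by absolute value, and compares the modulus of the characteristic function of a weighted sum of squared standard normals with the bound obtained from every fourth coefficient. The function names, the sample size 16 and the grid for `t` are illustrative assumptions.

```python
import numpy as np


def char_fn_modulus(lams, t):
    """|E exp(i t sum_k lam_k Y_k^2)| for independent standard normal Y_k."""
    t = np.asarray(t, dtype=float)[:, None]
    return np.prod((1.0 + 4.0 * lams[None, :] ** 2 * t ** 2) ** -0.25, axis=1)


def grouped_bound(lams, t):
    """Bound 1 / (1 + 4 (mu_1^2 + ...) t^2) with mu_k = lam_{4k} (1-based)."""
    mus = lams[3::4]  # lam_4, lam_8, ... once lams is sorted by |lam|
    s = np.sum(mus ** 2)
    return 1.0 / (1.0 + 4.0 * s * np.asarray(t, dtype=float) ** 2), s


# Hypothetical coefficients; the length is chosen divisible by 4.
rng = np.random.default_rng(0)
lams = rng.normal(size=16)
lams = lams[np.argsort(-np.abs(lams))]  # |lam_1| >= |lam_2| >= ...

t = np.linspace(-50.0, 50.0, 20001)
phi = char_fn_modulus(lams, t)
bound, s = grouped_bound(lams, t)

# (1 / 2 pi) times the integral of the bound over the real line.
density_bound = 1.0 / (4.0 * np.sqrt(s))
# Sum of lam_k^2 over k >= 5, the quantity assumed to be at least V / 4.
tail = np.sum(lams[4:] ** 2)

print("max of phi - bound:", np.max(phi - bound))
print("density bound:", density_bound, " 1/(2 sqrt(tail)):", 1.0 / (2.0 * np.sqrt(tail)))
```

Because each block of four consecutive coefficients dominates its last member in absolute value, `phi` never exceeds `bound`, and `4 * s` is at least `tail`; this is the reason the split is made at $k = 5$ rather than at $k = 4$.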

Proof of Lemma 3.1. This result is an immediate consequence of Theorem 3.1. □ 